imflash217: Add a trained RL agent for the LunarLander-v2 environment (trained with the PPO algorithm).
ddbf590
{"mean_reward": 259.250761749301, "std_reward": 10.873408423007765, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2023-01-13T11:21:29.412466"}