Add a trained RL agent for the LunarLander-v2 environment (trained with the PPO algorithm).
ddbf590
{"mean_reward": 259.250761749301, "std_reward": 10.873408423007765, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2023-01-13T11:21:29.412466"}
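
To show how the evaluation record above might be read, here is a minimal sketch that parses it and derives a conservative score. The `lower_bound` name, the "mean minus one standard deviation" summary, and the 200-point threshold are assumptions made for illustration; they are not stated anywhere in this commit.

```python
import json
from datetime import datetime

# Evaluation record, copied verbatim from the results above.
raw = (
    '{"mean_reward": 259.250761749301, "std_reward": 10.873408423007765, '
    '"is_deterministic": true, "n_eval_episodes": 10, '
    '"eval_datetime": "2023-01-13T11:21:29.412466"}'
)

results = json.loads(raw)

# Assumption: summarize the agent by mean reward minus one standard
# deviation, so a noisy agent is penalized relative to a consistent one.
lower_bound = results["mean_reward"] - results["std_reward"]

# Assumption: 200 points is used as the "solved" threshold for LunarLander.
SOLVED_THRESHOLD = 200.0
is_solved = lower_bound >= SOLVED_THRESHOLD

# The timestamp is ISO 8601 without a timezone, so it parses as a naive datetime.
evaluated_at = datetime.fromisoformat(results["eval_datetime"])

print(f"episodes evaluated: {results['n_eval_episodes']}")
print(f"mean - std reward:  {lower_bound:.2f}")
print(f"above threshold:    {is_solved}")
print(f"evaluated on:       {evaluated_at.date()}")
```

With only 10 evaluation episodes, the standard deviation is itself a rough estimate, so the derived score is best read as indicative rather than exact.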