ppo-LunarLander-v2-4milsteps-200-envs / FinetunedPPO_5mil_steps_total

Commit History

Upload PPO LunarLander-v2 trained agent, trained for 1 million more steps with a looser variance hyperparameter.
3120398

alexbalandi committed on