---
tags:
- LunarLander-v2
- ppo
- deep-reinforcement-learning
- reinforcement-learning
- custom-implementation
- deep-rl-course
model-index:
- name: PPO
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: LunarLander-v2
      type: LunarLander-v2
    metrics:
    - type: mean_reward
      value: -134.81 +/- 132.27
      name: mean_reward
      verified: false
---

# PPO Agent Playing LunarLander-v2

This is a trained model of a PPO agent playing LunarLander-v2.

# Hyperparameters

```python
{'exp_name': 'ppo',
 'raw_exp_version': None,
 'n_hidden_layers': 1,
 'd_hidden_layers': 64,
 'env_id': 'LunarLander-v2',
 'n_envs': 4,
 'total_timesteps': 1000000,
 'batch_timesteps': 128,
 'n_mini_batches': 4,
 'n_epochs': 4,
 'lr': 0.0003,
 'alpha': 0.2,
 'gamma': 0.99,
 'lmbda': 0.95,
 'normalize_advantages': True,
 'coef_vf': 0.5,
 'coef_s': 0.01,
 'max_grad_norm': 1.0,
 'seed': 2546260713,
 'cuda': True,
 'deterministic': True,
 'repo_id': 'knight9114/ppo-LunarLander-v2-unit8.1',
 'batch_size': 512,
 'mini_batch_size': 128,
 'exp_version': 'version_13'}
```
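
The `batch_size` and `mini_batch_size` entries look like derived values rather than independent settings: 4 environments times 128 timesteps gives 512, and 512 split into 4 mini-batches gives 128. The sketch below reproduces that arithmetic. It assumes this is how the training script computes them; the helper name `derive_batch_sizes` and the `n_updates` calculation are illustrative and not taken from the original code.

```python
from typing import Tuple


def derive_batch_sizes(
    n_envs: int, batch_timesteps: int, n_mini_batches: int
) -> Tuple[int, int]:
    """Return (batch_size, mini_batch_size) for one rollout.

    Assumption: one rollout collects `batch_timesteps` steps from each of
    `n_envs` parallel environments, and each epoch splits that rollout
    evenly into `n_mini_batches` mini-batches.
    """
    batch_size = n_envs * batch_timesteps
    mini_batch_size = batch_size // n_mini_batches
    return batch_size, mini_batch_size


# Values taken from the hyperparameter listing in this model card.
batch_size, mini_batch_size = derive_batch_sizes(
    n_envs=4, batch_timesteps=128, n_mini_batches=4
)

# Assumption: the training loop performs one policy update phase per rollout,
# so the number of update phases is total_timesteps // batch_size.
n_updates = 1_000_000 // batch_size

print(batch_size, mini_batch_size, n_updates)
```

Under these assumptions the derived values match the listed `batch_size` of 512 and `mini_batch_size` of 128.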