metadata
tags:
  - FrozenLake-v1-8x8
  - q-learning
  - reinforcement-learning
  - custom-implementation
model-index:
  - name: q-FrozenLake-v1-8x8-Slippery
    results:
      - task:
          type: reinforcement-learning
          name: reinforcement-learning
        dataset:
          name: FrozenLake-v1-8x8
          type: FrozenLake-v1-8x8
        metrics:
          - type: mean_reward
            value: 0.09 +/- 0.29
            name: mean_reward
            verified: false

Q-Learning Agent playing FrozenLake-v1

This is a trained model of a Q-Learning agent playing FrozenLake-v1 on the slippery 8x8 map.

Usage


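import gym  # or: import gymnasium as gym, depending on your setup

# load_from_hub is a helper you define yourself (download the file, then unpickle it); it is not a library function
model = load_from_hub(repo_id="MattStammers/q-FrozenLake-v1-8x8-Slippery", filename="q-learning.pkl")

# Check whether you need to pass additional attributes (this model's name suggests map_name="8x8", is_slippery=True)
env = gym.make(model["env_id"])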
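
Once the pickle is loaded, acting with the agent comes down to a greedy lookup in the Q-table. The sketch below assumes the table is stored as one row per state with one value per action. The toy table and the `greedy_action` helper are hypothetical and only there for illustration; check the loaded dictionary for the key that actually holds the table.

```python
def greedy_action(qtable, state):
    """Return the index of the highest-valued action for `state`."""
    row = qtable[state]
    return max(range(len(row)), key=lambda a: row[a])


# Hypothetical 2-state, 4-action table
# (FrozenLake actions: 0 = left, 1 = down, 2 = right, 3 = up).
toy_qtable = [
    [0.0, 0.2, 0.7, 0.1],
    [0.5, 0.1, 0.0, 0.3],
]

actions = [greedy_action(toy_qtable, s) for s in range(len(toy_qtable))]
print(actions)  # [2, 0]
```

The same function works on a NumPy array, since it only relies on indexing.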

This environment is not easy to solve with just a Q-table. It took a lot of training to get the agent to reach the goal even occasionally.
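
To put "occasionally" into numbers: FrozenLake pays a reward of 1 for reaching the goal and 0 otherwise, so the mean reward over the evaluation episodes is simply the success rate. The sketch below assumes 9 successful episodes out of the 100 evaluation episodes, which is one outcome consistent with the reported 0.09 +/- 0.29. The actual per-episode results are not part of this card.

```python
from statistics import mean, pstdev

# Assumed outcome: 9 episodes reach the goal (reward 1), 91 do not (reward 0).
episode_rewards = [1] * 9 + [0] * 91

mean_reward = mean(episode_rewards)
std_reward = pstdev(episode_rewards)  # population standard deviation

print(f"{mean_reward:.2f} +/- {std_reward:.2f}")  # 0.09 +/- 0.29
```

A standard deviation larger than the mean is expected when successes are rare and rewards are 0 or 1, so it does not by itself point to a noisy evaluation.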

Improving the agent further will probably take a different approach. I got this result by training with the following parameters:

{'env_id': 'FrozenLake-v1',
 'max_steps': 200,
 'n_training_episodes': 1000000,
 'n_eval_episodes': 100,
 'eval_seed': [],
 'learning_rate': 0.9,
 'gamma': 0.99,
 'max_epsilon': 1,
 'min_epsilon': 0.05,
 'decay_rate': 0.0005,