qgallouedec HF staff committed on
Commit
ee1ec66
1 Parent(s): b5570c4

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +1 -61
README.md CHANGED
@@ -5,6 +5,7 @@ tags:
  - deep-reinforcement-learning
  - reinforcement-learning
  - stable-baselines3
+ - HumanoidStandup-v4
  model-index:
  - name: PPO
  results:
@@ -20,64 +21,3 @@ model-index:
  name: mean_reward
  verified: false
  ---
-
- # **PPO** Agent playing **HumanoidStandup-v2**
- This is a trained model of a **PPO** agent playing **HumanoidStandup-v2**
- using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3)
- and the [RL Zoo](https://github.com/DLR-RM/rl-baselines3-zoo).
-
- The RL Zoo is a training framework for Stable Baselines3
- reinforcement learning agents,
- with hyperparameter optimization and pre-trained agents included.
-
- ## Usage (with SB3 RL Zoo)
-
- RL Zoo: https://github.com/DLR-RM/rl-baselines3-zoo<br/>
- SB3: https://github.com/DLR-RM/stable-baselines3<br/>
- SB3 Contrib: https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
-
- Install the RL Zoo (with SB3 and SB3-Contrib):
- ```bash
- pip install rl_zoo3
- ```
-
- ```
- # Download model and save it into the logs/ folder
- python -m rl_zoo3.load_from_hub --algo ppo --env HumanoidStandup-v2 -orga qgallouedec -f logs/
- python -m rl_zoo3.enjoy --algo ppo --env HumanoidStandup-v2 -f logs/
- ```
-
- If you installed the RL Zoo3 via pip (`pip install rl_zoo3`), from anywhere you can do:
- ```
- python -m rl_zoo3.load_from_hub --algo ppo --env HumanoidStandup-v2 -orga qgallouedec -f logs/
- python -m rl_zoo3.enjoy --algo ppo --env HumanoidStandup-v2 -f logs/
- ```
-
- ## Training (with the RL Zoo)
- ```
- python -m rl_zoo3.train --algo ppo --env HumanoidStandup-v2 -f logs/
- # Upload the model and generate video (when possible)
- python -m rl_zoo3.push_to_hub --algo ppo --env HumanoidStandup-v2 -f logs/ -orga qgallouedec
- ```
-
- ## Hyperparameters
- ```python
- OrderedDict([('batch_size', 32),
- ('clip_range', 0.3),
- ('ent_coef', 3.62109e-06),
- ('gae_lambda', 0.9),
- ('gamma', 0.99),
- ('learning_rate', 2.55673e-05),
- ('max_grad_norm', 0.7),
- ('n_envs', 1),
- ('n_epochs', 20),
- ('n_steps', 512),
- ('n_timesteps', 10000000.0),
- ('normalize', True),
- ('policy', 'MlpPolicy'),
- ('policy_kwargs',
- 'dict( log_std_init=-2, ortho_init=False, activation_fn=nn.ReLU, '
- 'net_arch=dict(pi=[256, 256], vf=[256, 256]) )'),
- ('vf_coef', 0.430793),
- ('normalize_kwargs', {'norm_obs': True, 'norm_reward': False})])
- ```
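
Note on the hyperparameters removed from the README in this commit: the block mixes PPO settings with settings that appear to describe the training run itself. The sketch below is not part of the commit. It copies the scalar values from the removed block (omitting `policy_kwargs` and `normalize_kwargs` for brevity) and separates the two groups, assuming that `n_envs`, `n_timesteps`, and `normalize` are handled by the RL Zoo rather than passed to PPO. The names `RUN_LEVEL_KEYS` and `split_hyperparams` are hypothetical helpers introduced only for illustration.

```python
from collections import OrderedDict

# Scalar hyperparameters copied from the README block removed in this commit.
hyperparams = OrderedDict([
    ("batch_size", 32),
    ("clip_range", 0.3),
    ("ent_coef", 3.62109e-06),
    ("gae_lambda", 0.9),
    ("gamma", 0.99),
    ("learning_rate", 2.55673e-05),
    ("max_grad_norm", 0.7),
    ("n_envs", 1),
    ("n_epochs", 20),
    ("n_steps", 512),
    ("n_timesteps", 10000000.0),
    ("normalize", True),
    ("policy", "MlpPolicy"),
    ("vf_coef", 0.430793),
])

# Assumption: these keys configure the training run (environment count,
# step budget, observation normalization) and are not PPO arguments.
RUN_LEVEL_KEYS = {"n_envs", "n_timesteps", "normalize"}


def split_hyperparams(params):
    """Hypothetical helper: split run-level settings from algorithm settings."""
    run_cfg = {k: v for k, v in params.items() if k in RUN_LEVEL_KEYS}
    algo_kwargs = {k: v for k, v in params.items() if k not in RUN_LEVEL_KEYS}
    return run_cfg, algo_kwargs


run_cfg, ppo_kwargs = split_hyperparams(hyperparams)

# The step budget is stored as a float in the README, so cast it to int.
total_timesteps = int(run_cfg["n_timesteps"])

# Each rollout collects n_steps transitions per environment and is then
# split into minibatches of batch_size for every optimization epoch.
rollout_size = hyperparams["n_steps"] * hyperparams["n_envs"]
minibatches_per_epoch = rollout_size // hyperparams["batch_size"]
gradient_steps_per_rollout = minibatches_per_epoch * hyperparams["n_epochs"]

print(rollout_size, minibatches_per_epoch, gradient_steps_per_rollout)
```

With these values each rollout holds 512 transitions, which gives 16 minibatches per epoch and 320 gradient steps per rollout.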