Quentin Gallouédec committed on
Commit
08990fb
1 Parent(s): 103ee13
Files changed (1)
  1. app.py +81 -13
app.py CHANGED
@@ -144,19 +144,6 @@ def get_leaderboard_df():
      return df


- TITLE = """
- 🚀 Open RL Leaderboard
- """
-
- INTRODUCTION_TEXT = """
- Welcome to the Open RL Leaderboard! This is a community-driven benchmark for reinforcement learning models.
- """
-
- ABOUT_TEXT = """
- The Open RL Leaderboard is a community-driven benchmark for reinforcement learning models.
- """
-
-
  def select_env(df: pd.DataFrame, env_id: str):
      df = df[df["env_id"] == env_id]
      df = df.sort_values("mean_episodic_return", ascending=False)
@@ -178,6 +165,87 @@ def format_df(df: pd.DataFrame):
      return df.values.tolist()


+ TITLE = """
+ 🚀 Open RL Leaderboard
+ """
+
+ INTRODUCTION_TEXT = """
+ Welcome to the Open RL Leaderboard! This is a community-driven benchmark for reinforcement learning models.
+ """
+
+ ABOUT_TEXT = r"""
+ The Open RL Leaderboard is a community-driven benchmark for reinforcement learning models.
+
+ ## 🔌 How to have your agent evaluated?
+
+ The Open RL Leaderboard continuously scans the 🤗 Hub for new models to evaluate. For your model to be evaluated, it must meet the following criteria:
+
+ 1. The model must be public on the 🤗 Hub.
+ 2. The model must contain an `agent.pt` file.
+ 3. The model must be [tagged](https://huggingface.co/docs/hub/model-cards#model-cards) `reinforcement-learning`.
+ 4. The model must be [tagged](https://huggingface.co/docs/hub/model-cards#model-cards) with the name of the environment you want your agent evaluated on (for example `MountainCar-v0`).
+
+ Once your model meets these criteria, it is automatically evaluated on the Open RL Leaderboard. That's it!
+
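+ As an illustration, here is a minimal upload sketch using `huggingface_hub` (the repository id `your-username/MountainCar-v0-agent` is a placeholder, and this assumes you already have an `agent.pt` on disk):
+
+ ```python
+ from huggingface_hub import HfApi
+
+ api = HfApi()
+ repo_id = "your-username/MountainCar-v0-agent"  # placeholder repository id
+ api.create_repo(repo_id, exist_ok=True)  # repos are public by default
+
+ # Tag the model through its model card metadata
+ card = "---\ntags:\n- reinforcement-learning\n- MountainCar-v0\n---\n"
+ api.upload_file(path_or_fileobj=card.encode(), path_in_repo="README.md", repo_id=repo_id)
+
+ # Upload the TorchScript agent
+ api.upload_file(path_or_fileobj="agent.pt", path_in_repo="agent.pt", repo_id=repo_id)
+ ```
+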
+ ## 🏗️ How do I build the `agent.pt`?
+
+ The `agent.pt` file is a [TorchScript module](https://pytorch.org/docs/stable/jit.html#). It must be loadable using `torch.jit.load`.
+ The module must take a batch of observations as input and return a batch of actions. To check whether your model is compatible with the Open RL Leaderboard, you can run the following code:
+
+ ```python
+ import gymnasium as gym
+ import numpy as np
+ import torch
+
+ agent_path = "path/to/agent.pt"
+ env_id = ...  # e.g. "MountainCar-v0"
+
+ agent = torch.jit.load(agent_path)
+ env = gym.make(env_id)
+ observations = np.array([env.observation_space.sample()])
+ observations = torch.from_numpy(observations)
+ actions = agent(observations)
+ actions = actions.numpy()[0]
+ assert env.action_space.contains(actions)
+ ```
+
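+ If your policy is a regular `torch.nn.Module`, one way to produce such a file is to script and save it. The module below is purely hypothetical (sized for `MountainCar-v0`); the only requirement is that `forward` maps a batch of observations to a batch of actions:
+
+ ```python
+ import torch
+ import torch.nn as nn
+
+ class Policy(nn.Module):  # hypothetical policy, replace with your own
+     def __init__(self, obs_dim: int = 2, n_actions: int = 3):
+         super().__init__()
+         self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
+
+     def forward(self, observations: torch.Tensor) -> torch.Tensor:
+         logits = self.net(observations)
+         return torch.argmax(logits, dim=-1)  # batch of greedy actions
+
+ scripted = torch.jit.script(Policy())
+ scripted.save("agent.pt")  # loadable with torch.jit.load("agent.pt")
+ ```
+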
+ ## 🕵 How are the models evaluated?
+
+ The evaluation is done by running the agent on the environment for 100 episodes.
+
+ For further information, please refer to the [Open RL Leaderboard evaluation script](https://huggingface.co/spaces/open-rl-leaderboard/leaderboard/blob/main/src/evaluation.py).
+
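+ In outline, the procedure amounts to the sketch below (an illustration only; the linked script is authoritative):
+
+ ```python
+ import gymnasium as gym
+ import numpy as np
+ import torch
+
+ agent = torch.jit.load("agent.pt")
+ env = gym.make("MountainCar-v0")  # example environment
+
+ returns = []
+ for _ in range(100):  # 100 evaluation episodes
+     observation, info = env.reset()
+     done, episodic_return = False, 0.0
+     while not done:
+         observations = torch.from_numpy(np.array([observation]))
+         action = agent(observations).numpy()[0]
+         observation, reward, terminated, truncated, info = env.step(action)
+         episodic_return += float(reward)
+         done = terminated or truncated
+     returns.append(episodic_return)
+
+ print("mean_episodic_return:", np.mean(returns))
+ ```
+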
+ ### The particular case of Atari environments
+
+ Atari environments are evaluated on the `NoFrameskip-v4` version of the environment. For example, to evaluate an agent on the `Pong` environment, you must tag your model with `PongNoFrameskip-v4`. The environment is then wrapped to match the standard Atari preprocessing pipeline (a wrapper sketch follows this list):
+
+ - No-op reset with a maximum of 30 no-ops
+ - Max-and-skip with a skip of 4
+ - Episodic life (although the reported score is for the full episode, not a single life)
+ - Fire reset
+ - Reward clipping (although the reported score is not clipped)
+ - Observation resized to 84x84
+ - Grayscale observation
+ - Frame stack of 4
+
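+ A sketch of how this pipeline can be assembled with Gymnasium's built-in wrappers (an approximation, assuming Gymnasium 0.29-style wrapper names; fire reset is not covered by `AtariPreprocessing` and would need a custom wrapper):
+
+ ```python
+ import gymnasium as gym
+
+ env = gym.make("PongNoFrameskip-v4")
+ # No-op reset (max 30), max-and-skip of 4, 84x84 grayscale, episodic life
+ env = gym.wrappers.AtariPreprocessing(
+     env, noop_max=30, frame_skip=4, screen_size=84,
+     grayscale_obs=True, terminal_on_life_loss=True,
+ )
+ env = gym.wrappers.TransformReward(env, lambda r: max(min(r, 1.0), -1.0))  # clip reward
+ env = gym.wrappers.FrameStack(env, 4)  # stack the last 4 frames
+ ```
+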
+ ## 🚑 Troubleshooting
+
+ If you run into a problem, please open an issue on the [Open RL Leaderboard repository](https://huggingface.co/spaces/open-rl-leaderboard/leaderboard/discussions/new).
+
+ ## 📜 Citation
+
+ ```bibtex
+ @misc{open-rl-leaderboard,
+   author = {Quentin Gallouédec and TODO},
+   title = {Open RL Leaderboard},
+   year = {2024},
+   publisher = {Hugging Face},
+   howpublished = "\url{https://huggingface.co/spaces/open-rl-leaderboard/leaderboard}",
+ }
+ ```
+ """
+
+
  with gr.Blocks() as demo:
      gr.HTML(TITLE)
      gr.Markdown(INTRODUCTION_TEXT, elem_classes="markdown-text")