zjowowen committed on
Commit 4b05080
1 Parent(s): 29a8ccd

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -21,7 +21,7 @@ model-index:
  type: LunarLander-v2
  metrics:
  - type: mean_reward
- value: -41.76 +/- 128.15
+ value: 206.55 +/- 102.39
  name: mean_reward
  ---
 
@@ -129,7 +129,7 @@ from huggingface_ding import push_model_to_hub
  # Instantiate the agent
  agent = MuZeroAgent(env_id="LunarLander-v2", exp_name="LunarLander-v2-MuZero")
  # Train the agent
- return_ = agent.train(step=int(10000))
+ return_ = agent.train(step=int(5000000))
  # Push model to huggingface hub
  push_model_to_hub(
  agent=agent.best,
@@ -149,7 +149,7 @@ pip3 install LightZero
  repo_id="OpenDILabCommunity/LunarLander-v2-MuZero",
  platform_info="[LightZero](https://github.com/opendilab/LightZero) and [DI-engine](https://github.com/opendilab/di-engine)",
  model_description="**LightZero** is an efficient, easy-to-understand open-source toolkit that merges Monte Carlo Tree Search (MCTS) with Deep Reinforcement Learning (RL), simplifying their integration for developers and researchers. More details are in paper [LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios](https://huggingface.co/papers/2310.08348).",
- create_repo=True
+ create_repo=False
  )
 
  ```
@@ -164,6 +164,7 @@ pip3 install LightZero
  exp_config = {
  'main_config': {
  'exp_name': 'LunarLander-v2-MuZero',
+ 'seed': 0,
  'env': {
  'env_id': 'LunarLander-v2',
  'continuous': False,
@@ -199,6 +200,7 @@ exp_config = {
  'collector_env_num': 8,
  'evaluator_env_num': 3,
  'env_type': 'not_board_games',
+ 'action_type': 'fixed_action_space',
  'battle_mode': 'play_with_bot_mode',
  'monitor_extra_statistics': True,
  'game_segment_length': 200,
@@ -294,13 +296,13 @@ exp_config = {
  - **Demo:** [video](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-MuZero/blob/main/replay.mp4)
  <!-- Provide the size information for the model. -->
  - **Parameters total size:** 15479.39 KB
- - **Last Update Date:** 2023-12-11
+ - **Last Update Date:** 2023-12-21
 
  ## Environments
  <!-- Address questions around what environment the model is intended to be trained and deployed at, including the necessary information needed to be provided for future users. -->
  - **Benchmark:** OpenAI/Gym/Box2d
  - **Task:** LunarLander-v2
  - **Gym version:** 0.25.1
- - **DI-engine version:** v0.4.9
- - **PyTorch version:** 2.1.1+cu121
+ - **DI-engine version:** v0.5.0
+ - **PyTorch version:** 2.0.1+cu117
  - **Doc**: [Environments link](<TODO>)