MoCapAct Model Zoo
Control of simulated humanoid characters is a challenging benchmark for sequential decision-making methods, as it assesses a policy's ability to drive an inherently unstable, discontinuous, and high-dimensional physical system. Motion capture (MoCap) data can be very helpful in learning sophisticated locomotion policies by teaching a humanoid agent low-level skills (e.g., standing, walking, and running) that can then be used to generate high-level behaviors. However, even with MoCap data, controlling simulated humanoids remains very hard, because this data offers only kinematic information. Finding physical control inputs to realize the MoCap-demonstrated motions has required methods like reinforcement learning that need large amounts of compute, which has effectively served as a barrier to entry for this exciting research direction.
In an effort to broaden participation and facilitate evaluation of ideas in humanoid locomotion research, we are releasing MoCapAct (Motion Capture with Actions), a library of high-quality pre-trained agents that can track over three hours of MoCap data for a simulated humanoid in the dm_control physics-based environment, together with rollouts from these experts containing proprioceptive observations and actions. MoCapAct allows researchers to sidestep the computationally intensive task of training low-level control policies from MoCap data and instead use MoCapAct's expert agents and demonstrations for learning advanced locomotion behaviors. It also allows researchers to improve on our low-level policies by using them and their demonstration data as a starting point.
In our work, we use MoCapAct to train a single hierarchical policy capable of tracking the entire MoCap dataset within dm_control.
We then re-use the learned low-level component to efficiently learn other high-level tasks.
Finally, we use MoCapAct to train an autoregressive GPT model and show that it can perform natural motion completion given a motion prompt.
We encourage the reader to visit our project website to see videos of our results as well as get links to our paper and code.
Model Zoo Structure
The file structure of the model zoo is:
├── all
│   └── experts
│       ├── experts_1.tar.gz
│       ├── experts_2.tar.gz
│       ...
│       └── experts_8.tar.gz
│
├── sample
│   └── experts.tar.gz
│
├── multiclip_policy.tar.gz
│   ├── full_dataset
│   └── locomotion_dataset
│
├── transfer.tar.gz
│   ├── go_to_target
│   │   ├── general_low_level
│   │   ├── locomotion_low_level
│   │   └── no_low_level
│   │
│   └── velocity_control
│       ├── general_low_level
│       ├── locomotion_low_level
│       └── no_low_level
│
├── gpt.ckpt
│
└── videos
    ├── full_clip_videos.tar.gz
    └── snippet_videos.tar.gz
Experts Tarball Files
The expert tarball files have the following structure:
all/experts/experts_*.tar.gz: Contains all of the clip snippet experts. Due to file size limitations, we split the experts among multiple tarball files.
sample/experts.tar.gz: Contains the clip snippet experts used to run the examples on the dataset website.
The expert structure is detailed in Appendix A.1 of the paper as well as https://github.com/microsoft/MoCapAct#description.
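Before any expert can be loaded, the tarballs have to be unpacked. The following is a minimal sketch of one way to do that with Python's standard library. The helper name `extract_archives` and its arguments are hypothetical and not part of MoCapAct, and the sketch assumes the archives are gzip-compressed tar files, as the `.tar.gz` extension suggests.

```python
import tarfile
from pathlib import Path


def extract_archives(archive_dir, out_dir, pattern="experts_*.tar.gz"):
    """Extract every tarball in archive_dir matching pattern into out_dir.

    Hypothetical helper for illustration; not part of the MoCapAct package.
    Returns the names of the archives that were extracted, in sorted order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    # Sort so the archives are processed in a predictable order.
    for archive in sorted(Path(archive_dir).glob(pattern)):
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(out_dir)
        extracted.append(archive.name)
    return extracted
```

For example, `extract_archives("/path/to/all/experts", "/path/to/experts")` would unpack all of the split expert archives into a single directory. Only extract archives obtained from a source you trust.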
An expert can be loaded and rolled out in Python as in the following example:
from mocapact import observables
from mocapact.sb3 import utils

# Load the expert for the snippet covering steps 0 to 194 of clip CMU_083_33.
expert_path = "/path/to/experts/CMU_083_33/CMU_083_33-0-194/eval_rsi/model"
expert = utils.load_policy(expert_path, observables.TIME_INDEX_OBSERVABLES)

from mocapact.envs import tracking
from dm_control.locomotion.tasks.reference_pose import types

# Build a tracking environment for the same snippet.
dataset = types.ClipCollection(ids=['CMU_083_33'], start_steps=[0], end_steps=[194])
env = tracking.MocapTrackingGymEnv(dataset)

# Roll out the expert until the episode ends, printing the reward at each step.
obs, done = env.reset(), False
while not done:
    action, _ = expert.predict(obs, deterministic=True)
    obs, rew, done, _ = env.step(action)
    print(rew)
Alternatively, an expert can be rolled out from the command line:
python -m mocapact.clip_expert.evaluate \
--policy_root /path/to/experts/CMU_016_22/CMU_016_22-0-82/eval_rsi/model \
--act_noise 0 \
--ghost_offset 1 \
--always_init_at_clip_start
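In the Python example in this section, the snippet directory name `CMU_083_33-0-194` lines up with the `ids`, `start_steps`, and `end_steps` values passed to `ClipCollection`. The sketch below recovers those values from an expert path. It assumes snippet directories are always named `<clip_id>-<start_step>-<end_step>`, which is inferred from the example paths here rather than confirmed; both helper names are hypothetical and not part of MoCapAct.

```python
from pathlib import Path


def parse_snippet_name(snippet_name):
    """Split a name like 'CMU_083_33-0-194' into (clip_id, start_step, end_step).

    Assumes the '<clip_id>-<start_step>-<end_step>' naming seen in the examples.
    """
    clip_id, start, end = snippet_name.rsplit("-", 2)
    return clip_id, int(start), int(end)


def snippet_from_expert_path(expert_path):
    """Return the parsed snippet for the first path component that looks like one."""
    for part in Path(expert_path).parts:
        pieces = part.rsplit("-", 2)
        if len(pieces) == 3 and pieces[1].isdigit() and pieces[2].isdigit():
            return parse_snippet_name(part)
    raise ValueError(f"no snippet directory found in {expert_path}")
```

The returned tuple can be used to fill in `ids=[clip_id]`, `start_steps=[start]`, and `end_steps=[end]` so that the environment tracks the same snippet the expert was trained on.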
GPT
The GPT policy is contained in gpt.ckpt and can be loaded using PyTorch Lightning:
from mocapact.distillation import model
policy = model.GPTPolicy.load_from_checkpoint('/path/to/gpt.ckpt', map_location='cpu')
This policy can be used with mocapact/distillation/motion_completion.py, as in the following example:
python -m mocapact.distillation.motion_completion \
--policy_path /path/to/gpt.ckpt \
--nodeterministic \
--ghost_offset 1 \
--expert_root /path/to/experts/CMU_016_25 \
--max_steps 500 \
--always_init_at_clip_start \
--prompt_length 32 \
--min_steps 32 \
--device cuda \
--clip_snippet CMU_016_25
Multi-Clip Policy
The multiclip_policy.tar.gz file contains two policies:
full_dataset: Trained on the entire MoCapAct dataset
locomotion_dataset: Trained on the locomotion_small portion of the MoCapAct dataset
Taking full_dataset as an example, a multi-clip policy can be loaded using PyTorch Lightning:
from mocapact.distillation import model
policy = model.NpmpPolicy.load_from_checkpoint('/path/to/multiclip_policy/full_dataset/model/model.ckpt', map_location='cpu')
The policy can be used with mocapact/distillation/evaluate.py, as in the following example:
python -m mocapact.distillation.evaluate \
--policy_path /path/to/multiclip_policy/full_dataset/model/model.ckpt \
--act_noise 0 \
--ghost_offset 1 \
--always_init_at_clip_start \
--termination_error_threshold 10 \
--clip_snippets CMU_016_22
Transfer
The transfer.tar.gz file contains policies for downstream tasks. The main difference between the contained folders is which low-level policy is used:
general_low_level: Low-level policy comes from multiclip_policy/full_dataset
locomotion_low_level: Low-level policy comes from multiclip_policy/locomotion_dataset
no_low_level: No low-level policy used
The policy structure is as follows:
├── best_model.zip
├── low_level_policy.ckpt
└── vecnormalize.pkl
The low_level_policy.ckpt file (only present in general_low_level and locomotion_low_level) contains the low-level policy and is loaded with PyTorch Lightning.
The best_model.zip file contains the task policy parameters.
The vecnormalize.pkl file contains the observation normalizer.
The latter two files are loaded with Stable-Baselines3.
The policy can be used with mocapact/transfer/evaluate.py, as in the following example:
python -m mocapact.transfer.evaluate \
--model_root /path/to/transfer/go_to_target/general_low_level \
--task /path/to/mocapact/transfer/config.py:go_to_target
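Because low_level_policy.ckpt is only present in some of the folders, it can be useful to check a model root before evaluating it. The sketch below only inspects the file layout described in this section and loads nothing. The helper name `inspect_transfer_policy` and the keys of the dictionary it returns are hypothetical and not part of MoCapAct.

```python
from pathlib import Path


def inspect_transfer_policy(model_root):
    """Report which of the documented policy files exist under model_root.

    Hypothetical helper for illustration; it checks file names only.
    """
    root = Path(model_root)
    task_policy = root / "best_model.zip"
    normalizer = root / "vecnormalize.pkl"
    low_level = root / "low_level_policy.ckpt"

    # The task policy and the observation normalizer appear in every folder.
    missing = [p.name for p in (task_policy, normalizer) if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"{root} is missing: {', '.join(missing)}")

    return {
        "task_policy": task_policy,
        "normalizer": normalizer,
        # Absent for no_low_level, so report None rather than failing.
        "low_level_policy": low_level if low_level.is_file() else None,
    }
```

A result whose `low_level_policy` entry is `None` corresponds to a no_low_level folder, where the task policy acts without a low-level policy.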
MoCap Videos
There are two tarball files containing videos of the MoCap clips in the dataset:
full_clip_videos.tar.gz contains videos of the full MoCap clips.
snippet_videos.tar.gz contains videos of the snippets that were used to train the experts. Note that they are playbacks of the clips themselves, not rollouts of the corresponding experts.