inferring device map for model

#4
by mahdi-b - opened

I am trying to load the pre-trained model using device_map="auto", but I get the following error:

...
File ~/anaconda3/envs/esm_2/lib/python3.9/site-packages/transformers/modeling_utils.py:2406, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   2404 # Dispatch model with hooks on all devices if necessary
   2405 if device_map is not None:
-> 2406     dispatch_model(model, device_map=device_map, offload_dir=offload_folder, offload_index=offload_index)
   2408 if output_loading_info:
   2409     if loading_info is None:

TypeError: dispatch_model() got an unexpected keyword argument 'offload_index'

It looks like the dispatch_model function indeed does not accept an 'offload_index' argument. I tried removing offload_index from the function call, but that crashed my server πŸ˜‚.
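
For reference, the kind of call that triggers the traceback looks roughly like the following (I am showing the small 8M checkpoint here purely for illustration):

from transformers import AutoModel

# device_map="auto" is what routes from_pretrained through dispatch_model
model = AutoModel.from_pretrained("facebook/esm2_t6_8M_UR50D", device_map="auto")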
Also, it seems that I cannot even generate a device map for the Facebook ESM model. Trying the following:

from accelerate import infer_auto_device_map, init_empty_weights
from transformers import AutoConfig, AutoModel

config = AutoConfig.from_pretrained("facebook/esm2_t6_8M_UR50D")

# Instantiate the model on the meta device (no memory allocated),
# then ask accelerate to infer a device map for it.
with init_empty_weights():
    model = AutoModel.from_config(config)
device_map = infer_auto_device_map(model)
device_map

returns

{'': 0}

While running a small model like facebook/esm2_t6_8M_UR50D is not an issue, I am afraid that the larger models (3B or 15B) will not be usable unless one can split the weights across GPUs. Any thoughts about the issue above would be greatly appreciated.
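
For what it is worth, this is the kind of call I was hoping would split a larger checkpoint, using infer_auto_device_map's max_memory argument; the 3B checkpoint name and the memory budgets below are only illustrative placeholders, not measured requirements:

from accelerate import infer_auto_device_map, init_empty_weights
from transformers import AutoConfig, AutoModel

config = AutoConfig.from_pretrained("facebook/esm2_t36_3B_UR50D")

with init_empty_weights():
    model = AutoModel.from_config(config)

# A tight per-device budget should make accelerate spread layers across
# GPU 0, GPU 1 and the CPU instead of returning {'': 0}.
device_map = infer_auto_device_map(
    model,
    max_memory={0: "10GiB", 1: "10GiB", "cpu": "30GiB"},
)
print(device_map)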

Thank you!
