Can you help me fine-tune this with LoRA? (Having an error)
#12 opened by AayushShah
The following code freezes the model's layers (a crucial step for LoRA fine-tuning):
import torch
import torch.nn as nn

for param in model.parameters():
    param.requires_grad = False  # freeze the base model
    if param.ndim == 1:
        # cast 1-D params (e.g. layer norms) to fp32 for training stability
        param.data = param.data.to(torch.float32)

model.gradient_checkpointing_enable()  # trade compute for memory
model.enable_input_require_grads()

class CastOutputToFloat(nn.Sequential):
    def forward(self, x):
        return super().forward(x).to(torch.float32)

model.lm_head = CastOutputToFloat(model.lm_head)  ### This line gives the error
The error is:
AttributeError: 'GPTNeoXForCausalLM' object has no attribute 'lm_head'
Please help,
Thanks 🤗
It appears that GPTNeoXForCausalLM's output layer is model.embed_out instead of model.lm_head. Could you modify the name and try again? @AayushShah
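For reference, here is a minimal sketch of how the casting step and a basic LoRA setup could look for GPT-NeoX. The checkpoint name, LoRA hyperparameters, and the "query_key_value" target module are illustrative assumptions, not values taken from this thread:

import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Example GPT-NeoX-based checkpoint (assumption; use your own model)
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m")

# GPT-NeoX exposes its output projection as `embed_out`, not `lm_head`
class CastOutputToFloat(nn.Sequential):
    def forward(self, x):
        return super().forward(x).to(torch.float32)

model.embed_out = CastOutputToFloat(model.embed_out)
# A generic alternative that works across architectures:
# output_layer = model.get_output_embeddings()

# Illustrative LoRA configuration; "query_key_value" is the fused attention
# projection used inside GPT-NeoX blocks
config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["query_key_value"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()

With the output layer renamed to embed_out, the rest of the freezing/casting code from the original post should run unchanged.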