hazemessam/esm3_ddg_v2

Model Details

Model Description

This model was developed as part of the EvolutionaryScale BioML Hackathon.

Uses

The model predicts ddG (the change in folding free energy upon mutation) for single-point mutations.

How to Get Started with the Model

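# Make sure `esm` is installed; if not, run: `pip install esm`
import torch
from transformers import AutoModel
from esm.tokenization.sequence_tokenizer import EsmSequenceTokenizer

# trust_remote_code=True executes the model code shipped in the repository
model = AutoModel.from_pretrained("hazemessam/esm3_ddg_v2", trust_remote_code=True)
tokenizer = EsmSequenceTokenizer()
model.eval()  # disable dropout for inference

# NOTE: tokenized_seq1, tokenized_seq2, and mutation_position are not defined in this card.
# They presumably hold the two tokenized sequences (e.g. wild type and mutant) and the
# mutated residue position; check the repository's model code for the exact input format.
with torch.no_grad():
    output = model(tokenized_seq1, tokenized_seq2, positions=mutation_position)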
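The snippet above needs two sequences and a mutation position, but the card does not show how to build them. The sketch below is one possible way to derive a mutant sequence and a position from a mutation string such as `T3G`. The helper name `apply_point_mutation` and the toy sequence are hypothetical and not part of the model repository. It is also an assumption that the mutation string is 1-based; the card does not say whether `positions` expects a 0-based or 1-based index, or whether the index must be shifted to account for special tokens added by the tokenizer.

```python
import re
from typing import Tuple


def apply_point_mutation(wild_type: str, mutation: str) -> Tuple[str, int]:
    """Return (mutant_sequence, zero_based_index) for a mutation such as "T3G".

    Hypothetical helper: it is not part of the model repository.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation string: {mutation!r}")

    wt_residue = match.group(1)
    position = int(match.group(2))  # assumed to be 1-based, as is conventional
    mut_residue = match.group(3)

    index = position - 1
    if not 0 <= index < len(wild_type):
        raise ValueError(f"Position {position} is outside the sequence")
    if wild_type[index] != wt_residue:
        raise ValueError(
            f"Expected {wt_residue} at position {position}, found {wild_type[index]}"
        )

    # Substitute exactly one residue and leave the rest of the sequence untouched.
    mutant = wild_type[:index] + mut_residue + wild_type[index + 1:]
    return mutant, index


wild_type = "MKTAYIAK"  # hypothetical toy sequence, not from the training data
mutant, index = apply_point_mutation(wild_type, "T3G")
print(mutant, index)
```

Validating the wild-type residue before substituting catches off-by-one errors in the position early, which matters because a shifted position would still produce a plausible-looking sequence.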

Training Details

Training Data

The model was trained on the S2648 dataset: https://huggingface.co/datasets/hazemessam/ddg/blob/main/S2648.csv

Training Procedure

The results listed below are the best results obtained on each evaluation dataset. This checkpoint, however, is the one that performed best on the Ssym evaluation dataset.

Training Hyperparameters

Scheduler: Cosine
Warmup steps: 400
Seed: 7
Gradient accumulation steps: 16
Batch size: 1
DoRA rank: 16
DoRA alpha: 32
Updated Layers: ["layernorm_qkv.1", "ffn.1", "ffn.3"]
DoRA bias: "none"
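To make the scheduler settings concrete, here is a minimal sketch of a cosine schedule with warmup, together with the effective batch size implied by gradient accumulation. The warmup steps, batch size, and accumulation steps come from the list above. The linear warmup shape, the decay to zero, and `TOTAL_STEPS` are assumptions, since the card does not state them, so this may differ from the schedule actually used in training.

```python
import math

WARMUP_STEPS = 400      # from the hyperparameters above
GRAD_ACCUM_STEPS = 16   # from the hyperparameters above
BATCH_SIZE = 1          # from the hyperparameters above
TOTAL_STEPS = 4000      # hypothetical: the total number of steps is not stated


def lr_multiplier(step, warmup_steps=WARMUP_STEPS, total_steps=TOTAL_STEPS):
    """Factor applied to the peak learning rate at a given step."""
    if step < warmup_steps:
        # Assumed linear warmup from 0 up to the peak learning rate.
        return step / max(1, warmup_steps)
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


# With gradient accumulation, each optimizer update sees this many samples.
effective_batch_size = BATCH_SIZE * GRAD_ACCUM_STEPS

print(effective_batch_size)
print(lr_multiplier(200), lr_multiplier(400), lr_multiplier(TOTAL_STEPS))
```

A batch size of 1 with 16 accumulation steps keeps memory use low while still averaging gradients over 16 samples per update.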


Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on the following datasets: Ssym, Ssym_r, Myoglobin, and Myoglobin_r.

Results

Ssym: Pearson correlation 0.85, RMSE 0.83

Ssym_r: Pearson correlation 0.85, RMSE 0.83

Myoglobin: Pearson correlation 0.65, RMSE 0.83

Myoglobin_r: Pearson correlation 0.65, RMSE 0.84
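For readers who want to compute the same two metrics on their own predictions, the sketch below implements Pearson correlation and RMSE in plain Python. The `measured` and `predicted` values are hypothetical toy numbers, not results from this model, and the card does not specify which implementation produced the figures above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical ddG values, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

correlation = pearson(measured, predicted)
error = rmse(measured, predicted)
print(correlation, error)
```

In this toy case the predictions are perfectly correlated with the measurements but offset by a constant, so the correlation is at its maximum while the RMSE is not zero. That is why reporting both metrics is informative.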