
Model Details

Model Description

This model was created for the Evolutionary Scale BioML Hackathon.

Uses

The model predicts the change in folding free energy (ddG) caused by a single mutation in a protein sequence.

How to Get Started with the Model

# Requires the `esm` package; if it is not installed, run: `pip install esm`
from transformers import AutoModel
from esm.tokenization.sequence_tokenizer import EsmSequenceTokenizer
import torch

model = AutoModel.from_pretrained("hazemessam/esm3_ddg_v2", trust_remote_code=True)
tokenizer = EsmSequenceTokenizer()
model.eval()

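# tokenized_seq1, tokenized_seq2, and mutation_position are placeholders that
# this snippet does not define; prepare them before running the forward pass.
with torch.no_grad():
    output = model(tokenized_seq1, tokenized_seq2, positions=mutation_position)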
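
The snippet above takes two sequences and a mutation position but does not show how to derive them. The following is a minimal sketch of one way to build a mutant sequence from a wild-type sequence, assuming the two inputs correspond to the wild-type and mutant sequences (this card does not confirm that). The helper `apply_point_mutation`, the mutation string format, and the toy sequence are hypothetical and are not part of the model's code.

```python
def apply_point_mutation(wild_type: str, mutation: str) -> tuple[str, int]:
    """Return (mutant_sequence, zero_based_position) for a mutation such as 'K2R'.

    The mutation string is assumed to use 1-based residue numbering:
    original residue, position, new residue.
    """
    original, new = mutation[0], mutation[-1]
    position = int(mutation[1:-1]) - 1  # convert to a 0-based index
    if not 0 <= position < len(wild_type):
        raise ValueError(f"position {position + 1} is outside the sequence")
    if wild_type[position] != original:
        raise ValueError(
            f"expected {original} at position {position + 1}, "
            f"found {wild_type[position]}"
        )
    mutant = wild_type[:position] + new + wild_type[position + 1:]
    return mutant, position


wild_type_seq = "MKTAYIAK"  # hypothetical toy sequence, not from the training data
mutant_seq, mutation_index = apply_point_mutation(wild_type_seq, "K2R")
print(mutant_seq, mutation_index)
```

Whether the model expects a 0-based residue index, a 1-based index, or a token index (tokenizers often prepend a start token) is not stated in this card, so check the remote code before passing `mutation_index` as `positions`.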

Training Details

Training Data

The model was trained on the S2648 dataset: https://huggingface.co/datasets/hazemessam/ddg/blob/main/S2648.csv

Training Procedure

The results listed below are the best results obtained for each evaluation dataset; this checkpoint, however, is the one that performed best on the Ssym evaluation dataset.

Training Hyperparameters

  • Scheduler: Cosine
  • Warmup steps: 400
  • Seed: 7
  • Gradient accumulation steps: 16
  • Batch size: 1
  • DoRA rank: 16
  • DoRA alpha: 32
  • Updated Layers: ["layernorm_qkv.1", "ffn.1", "ffn.3"]
  • DoRA bias: "none"
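
To make the scheduler settings above concrete, here is a rough sketch of a cosine schedule with warmup and of the effective batch size implied by gradient accumulation. It assumes linear warmup, that warmup is counted in optimizer steps, and a hypothetical total of 4000 steps; none of these are stated in this card, and `lr_multiplier` is an illustrative name rather than the training code.

```python
import math

WARMUP_STEPS = 400      # from the hyperparameter list
GRAD_ACCUM_STEPS = 16   # from the hyperparameter list
BATCH_SIZE = 1          # from the hyperparameter list
TOTAL_STEPS = 4000      # hypothetical; the card does not give the total step count


def lr_multiplier(step: int,
                  warmup_steps: int = WARMUP_STEPS,
                  total_steps: int = TOTAL_STEPS) -> float:
    """Factor applied to the base learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear warmup from 0 to 1 (assumed; the card only says "Warmup steps").
        return step / max(1, warmup_steps)
    # Cosine decay from 1 to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


# Samples contributing to each optimizer update.
effective_batch_size = BATCH_SIZE * GRAD_ACCUM_STEPS

for step in (0, 200, 400, 2200, 4000):
    print(step, round(lr_multiplier(step), 4))
print("effective batch size:", effective_batch_size)
```

With a batch size of 1 and 16 accumulation steps, each weight update averages gradients over 16 samples.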


Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on the following datasets: Ssym, Ssym_r, Myoglobin, and Myoglobin_r.

Results

  • Ssym: Pearson correlation 0.85, RMSE 0.83
  • Ssym_r: Pearson correlation 0.85, RMSE 0.83
  • Myoglobin: Pearson correlation 0.65, RMSE 0.83
  • Myoglobin_r: Pearson correlation 0.65, RMSE 0.84
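
For readers who want to reproduce the two metrics reported above, this is a small dependency-free sketch of Pearson correlation and RMSE between predicted and measured ddG values. The function names and the toy numbers are illustrative only; they are not the evaluation script used for this model, and the values are not taken from any of the evaluation datasets.

```python
import math


def pearson_correlation(predicted: list[float], measured: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_p * var_m)


def rmse(predicted: list[float], measured: list[float]) -> float:
    """Root mean squared error between two equal-length lists."""
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy values for illustration only.
predicted_ddg = [0.5, -1.2, 2.0, 0.1]
measured_ddg = [0.7, -0.9, 1.5, 0.4]
print(pearson_correlation(predicted_ddg, measured_ddg))
print(rmse(predicted_ddg, measured_ddg))
```

Pearson correlation measures how well predictions track the ranking and linear trend of the measurements, while RMSE measures the typical size of the error in the same units as ddG, so the two can disagree.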

Model size: 1.4B params (Safetensors, tensor type F32)