
ProtBert-BFD finetuned on Rosetta 20AA dataset

This model is finetuned to predict Rosetta fold energy from sequence, using a dataset of 100k 20AA (20-residue) sequences.

Current model in this repo: prot_bert_bfd-finetuned-032722_1752

Performance

  • 20AA sequences (1k eval set):
    MAE 0.090115, R² 0.991208, MSE 0.013034, RMSE 0.114165

  • 40AA sequences (10k eval set):
    MAE 0.537456, R² 0.659122, MSE 0.448607, RMSE 0.669781

  • 60AA sequences (10k eval set):
    MAE 0.629267, R² 0.506747, MSE 0.622476, RMSE 0.788972
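A minimal sketch of how this finetuned checkpoint could be used to score sequences, assuming it is exported with a single-label regression head loadable via `AutoModelForSequenceClassification`. The repo id `user/prot_bert_bfd-finetuned-032722_1752` is a placeholder, not the actual hub path.

```python
# Sketch: predicting Rosetta fold energy with the finetuned regressor.
# Assumption: the checkpoint has a 1-output regression head; the repo id
# below is a placeholder for the actual hub location of this model.

def space_sequence(seq: str) -> str:
    """ProtBert tokenizers expect amino acids separated by single spaces."""
    return " ".join(seq.upper())

def predict_fold_energy(seq: str,
                        repo_id: str = "user/prot_bert_bfd-finetuned-032722_1752") -> float:
    # Imports kept local so the helper above works without transformers installed.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSequenceClassification.from_pretrained(repo_id, num_labels=1)
    model.eval()

    inputs = tokenizer(space_sequence(seq), return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.logits.item()  # scalar regression output = predicted fold energy

if __name__ == "__main__":
    print(space_sequence("MKTAYIAKQR"))  # M K T A Y I A K Q R
```

Note the spacing step: ProtBert vocabularies tokenize individual residues, so an unspaced sequence would be mapped almost entirely to unknown tokens.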

prot_bert_bfd from ProtTrans

The starting pretrained model is from ProtTrans, trained on 2.1 billion protein sequences from BFD using a masked language modeling (MLM) objective. It was introduced in the ProtTrans paper and first released in the ProtTrans repository.
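Because the base model was trained with an MLM objective, the public `Rostlab/prot_bert_bfd` checkpoint can be queried directly for masked-residue predictions. A hedged sketch (the `mask_position` helper is illustrative, not part of this repo):

```python
# Sketch: masked-residue prediction with the pretrained prot_bert_bfd base model.

def mask_position(seq: str, i: int) -> str:
    """Return a spaced sequence with residue i replaced by the [MASK] token."""
    residues = list(seq.upper())
    residues[i] = "[MASK]"
    return " ".join(residues)

def top_residue_predictions(seq: str, i: int, k: int = 5):
    # Local import: requires transformers and a network/cache to fetch the model.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model="Rostlab/prot_bert_bfd")
    return unmasker(mask_position(seq, i), top_k=k)

if __name__ == "__main__":
    print(mask_position("MKTAYIAKQR", 3))  # M K T [MASK] Y I A K Q R
```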

Created by Ladislav Rampasek
