Update README.md
README.md
CHANGED
@@ -4,13 +4,13 @@ library_name: transformers
 datasets:
 - CCDS
 - Ensembl
-pipeline_tag:
+pipeline_tag: fill-mask
 tags:
 - protein language model
 - biology
 widget:
-- text: ( Z E V L P Y G D E K L S P
-  example_title:
+- text: ( Z E V L P Y G D E K L S P [MASK] G D G G D V G Q I F s C L Q ]
+  example_title: Fill codon mask (Y)
 ---
 
 # cdsBERT
@@ -18,7 +18,7 @@ widget:
 
 ## Model description
 
-
+cdsBERT is a protein language model (pLM) with a codon vocabulary. It was seeded with [ProtBERT](https://huggingface.co/Rostlab/prot_bert_bfd) and trained with a novel vocabulary extension pipeline called MELD. Its latent space is highly biologically relevant, and its Enzyme Commission (EC) number prediction surpasses ProtBERT's.
 
 ## How to use
 
@@ -46,10 +46,10 @@ vector_embedding = matrix_embedding.mean(dim=0)
 ```
 
 ## Intended use and limitations
-
+cdsBERT serves as a general purpose
 
 ## Our lab
-The [Gleghorn lab](https://www.gleghornlab.com/) is an
+The [Gleghorn lab](https://www.gleghornlab.com/) is an interdisciplinary research group at the University of Delaware that focuses on solving translational problems with our expertise in engineering, biology, and chemistry. We develop inexpensive and reliable tools to study organ development, maternal-fetal health, and drug delivery. Recently, we have begun exploring protein language models and strive to make protein design and annotation accessible.
 
 ## Please cite
 Coming soon!
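The widget entry added in this commit gives the fill-mask pipeline a space-separated token string with one position replaced by `[MASK]`. The following is a minimal sketch of how such an input could be built. The helper name `mask_token_at` is hypothetical and not part of the repository, and the sketch assumes that the "(Y)" in the example title names the token hidden behind the mask.

```python
MASK_TOKEN = "[MASK]"  # BERT-style mask token, as it appears in the widget text


def mask_token_at(tokens, index, mask_token=MASK_TOKEN):
    """Return a space-separated string with tokens[index] replaced by the mask token.

    Hypothetical helper for illustration only; it is not part of cdsBERT.
    """
    if not 0 <= index < len(tokens):
        raise IndexError(f"index {index} is out of range for {len(tokens)} tokens")
    masked = list(tokens)  # copy so the caller's list is left untouched
    masked[index] = mask_token
    return " ".join(masked)


# Token sequence from the widget example, with the masked position restored to "Y"
# (assumption: the "(Y)" in the example title names the expected token).
tokens = "( Z E V L P Y G D E K L S P Y G D G G D V G Q I F s C L Q ]".split()

widget_text = mask_token_at(tokens, 14)
print(widget_text)
```

The resulting string has the same shape as the widget `text` field and could be passed to a `transformers` fill-mask pipeline. The model identifier is not shown in this diff, so loading the model is left out here.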