agemagician committed
Commit 286cb6e • 1 Parent(s): 00a97b6
Update README.md
README.md CHANGED
@@ -18,7 +18,7 @@ Pretrained model on protein sequences using a masked language modeling (MLM) obj
 
 ## Model description
 
-
+Ankh2-ext1 is based on the `ANKH-Large` model and was pretrained on a large corpus of protein sequences in a self-supervised fashion.
 This means it was pretrained on the raw protein sequences only, with no humans labelling them in any way (which is why it can use lots of
 publicly available data) with an automatic process to generate inputs and labels from those protein sequences.
 
@@ -82,7 +82,7 @@ The details of the masking procedure for each sequence are as follows:
 
 ### Pretraining
 
-The model was trained on a single TPU Pod
+The model was trained on a single TPU Pod V5-lite for 45 epochs in total, using sequence length 512 (batch size 1k).
 It was trained using ANKH-Large model as an initial checkpoint, rather than training from scratch.
 It has a total of approximately 2B parameters and was trained using the encoder-decoder architecture.
 The optimizer used is Adafactor with linear warmup with linear decay learning rate schedule for pre-training.
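
Since the model description presents Ankh2-ext1 as an ANKH-Large-derived encoder-decoder for protein sequences, a usage sketch may help; this assumes the checkpoint follows the original Ankh/T5-style setup with one token per amino acid, and the repository id below is a placeholder, not confirmed by this commit.

```python
# Hypothetical sketch: extracting per-residue embeddings with a T5-style
# encoder, assuming the original Ankh conventions. Repo id is a placeholder.
import torch
from transformers import AutoTokenizer, T5EncoderModel

repo_id = "agemagician/ankh2-ext1"  # placeholder, assumed for illustration

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = T5EncoderModel.from_pretrained(repo_id)
model.eval()

sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # example protein sequence

# Ankh-style tokenizers treat each amino acid as a single token, so the
# residues are passed as a pre-split list of characters.
encoded = tokenizer(
    [list(sequence)],
    is_split_into_words=True,
    add_special_tokens=True,
    return_tensors="pt",
)

with torch.no_grad():
    embeddings = model(**encoded).last_hidden_state  # (1, tokens, hidden_dim)

print(embeddings.shape)
```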