---
license: mit
tags:
- mamba
- pytorch
- Text Generation
- research abstract
datasets: pt-sk/research_papers_short
metrics: CrossEntropyLoss
---
This model uses the Mamba architecture and was trained on a dataset of research abstracts.
* Optimizer: AdamW
* Learning Rate: 0.001

Import the model classes from the scripts in the code folder
```
from model import Mamba, ModelArgs
```
Loading Model
```
mamba_model = Mamba.from_pretrained("pt-sk/mamba").to("cuda")
```
Loading Tokenizer
```
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('pt-sk/mamba')
```
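Generating Text (sketch)

Once the model and tokenizer are loaded, text is typically produced one token at a time. The following is a minimal greedy-decoding sketch, not the repository's own generation code. The names `greedy_generate`, `next_token_logits`, and `toy_logits` are hypothetical, and the toy scorer only stands in for the real model so the loop can run without downloading weights.

```python
from typing import Callable, List, Optional, Sequence


def greedy_generate(
    next_token_logits: Callable[[List[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    max_new_tokens: int,
    eos_id: Optional[int] = None,
) -> List[int]:
    """Append the highest-scoring token until the budget or EOS is reached."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)  # scores for the next token only
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids


def toy_logits(ids: List[int]) -> List[float]:
    """Hypothetical stand-in for the model: always prefers (last token + 1) mod 5."""
    vocab_size = 5
    target = (ids[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


generated = greedy_generate(toy_logits, [0], max_new_tokens=4)
print(generated)  # [0, 1, 2, 3, 4]
```

To use this with the loaded model, `next_token_logits` would wrap a forward pass and return the scores at the last position. That assumes calling `mamba_model` on a batch of token ids returns logits of shape (batch, sequence, vocabulary), which this card does not state. Prompt ids can come from `tokenizer.encode(...)` and the result can be turned back into text with `tokenizer.decode(...)`.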
The mamba_reserach file contains the state dicts of the optimizer and the model.
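Restoring a Checkpoint (sketch)

A checkpoint that holds both state dicts is usually restored as sketched below. This is a hedged example, not the author's training script: the dictionary keys `"model"` and `"optimizer"` are assumptions, and a small `nn.Linear` stands in for `Mamba` so the round trip runs on its own. Inspect the keys of the real file before relying on these names. The optimizer settings follow the ones listed above (AdamW, learning rate 0.001).

```python
import os
import tempfile

import torch
from torch import nn

# Stand-in module; in practice this would be the Mamba model.
model = nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Hypothetical layout: one dict holding both state dicts.
checkpoint = {"model": model.state_dict(), "optimizer": optimizer.state_dict()}
path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")
torch.save(checkpoint, path)

# Rebuild fresh objects, then load the saved states into them.
restored_model = nn.Linear(4, 2)
restored_optimizer = torch.optim.AdamW(restored_model.parameters(), lr=1e-3)
state = torch.load(path, map_location="cpu")
print(list(state.keys()))  # check the actual key names first
restored_model.load_state_dict(state["model"])
restored_optimizer.load_state_dict(state["optimizer"])
```

Loading with `map_location="cpu"` avoids requiring a GPU just to read the file; the model can be moved to "cuda" afterwards.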