---
library_name: pytorch
pipeline_tag: text2text-generation
language:
- vi
- lo
metrics:
- bleu
---
|
|
|
## Direct Use

Please use Python version 3.10.
|
|
|
### Load a pre-trained model

Use `load_config` to load a `.yaml` config file.

Then use `load_model_tokenizer` to load the pretrained model and its source and target tokenizers.
|
```
from config import load_config
from load_model import load_model_tokenizer

# Load the .yaml config file
config = load_config(file_name='config/config_final.yaml')

# Load the pretrained model and its source/target tokenizers
model, src_tokenizer, tgt_tokenizer = load_model_tokenizer(config)
```
|
|
|
### Translate Lao (lo) to Vietnamese (vi)

Use the `translate` function in `translate.py`.
|
```
from translate import translate
from config import load_config
from load_model import load_model_tokenizer

config = load_config(file_name='config/config_final.yaml')
model, src_tokenizer, tgt_tokenizer = load_model_tokenizer(config)

text = " "  # replace with the Lao source sentence you want to translate

# Returns the Vietnamese translation and the attention weights
translation, attn = translate(
    model, src_tokenizer, tgt_tokenizer, text,
    decode_method='beam-search',
)
print(translation)
```
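
The second value returned by `translate`, `attn`, can be used to inspect what the model attended to while decoding. Its exact shape is not documented here, so the snippet below is only a sketch under the assumption that `attn` squeezes down to a 2-D array of target-by-source attention weights; `numpy` and `matplotlib` are extra dependencies of the sketch, not of the model.

```
import numpy as np
import matplotlib.pyplot as plt

# Assumption: attn can be converted to a 2-D (target_len x source_len) matrix.
attn_matrix = np.asarray(attn).squeeze()

plt.imshow(attn_matrix, cmap='viridis', aspect='auto')
plt.xlabel('source tokens (lo)')
plt.ylabel('target tokens (vi)')
plt.colorbar(label='attention weight')
plt.title('Attention for the translated sentence')
plt.show()
```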
|
|
|
## Training

Use the `train_model` function in `train.py` to train your model.
|
```
from train import train_model
from config import load_config

config = load_config(file_name='config/config_final.yaml')

# Train the model according to the settings in the config file
train_model(config)
```
|
|
|
If you wish to continue training or fine-tune our model, you should modify `num_epochs` in your desired config file and read the following notes (`+` denotes string concatenation); an example config snippet follows the list:

- The code saves and preloads models in `model_folder`.
- The code preloads the model named `model_basename` + `preload` + `.pt`.
- The code will NOT preload a trained model if you set `preload` to `null`.
- Every epoch, the code saves the model under the name `model_basename` + `_` + (current epoch) + `.pt`.
- `train_model` will automatically continue training the preloaded model.
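
For reference, here is a minimal sketch of what the relevant part of such a `.yaml` config file might look like. The key names `model_folder`, `model_basename`, `preload`, and `num_epochs` come from the notes above; the values are purely illustrative assumptions, not the shipped defaults.

```
# Hypothetical excerpt of config/config_final.yaml -- adjust values to your setup
model_folder: weights        # directory where checkpoints are saved and preloaded
model_basename: tmodel       # checkpoint filename prefix
preload: "_10"               # preloads weights/tmodel_10.pt; set to null to start from scratch
num_epochs: 20               # increase this to keep training past the preloaded epoch
```

With a config like this, calling `train_model(config)` should resume from the preloaded checkpoint and keep saving a new checkpoint after every epoch, following the naming scheme described above.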