---
pipeline_tag: image-classification
license: mit
---

## Table cell classification

The model classifies table cell images as either empty or non-empty. It was trained on
table cell images from Finnish census and death record tables from the 1930s.

The model was trained with [DenseNet121](https://pytorch.org/vision/stable/models/generated/torchvision.models.densenet121.html) as the base model
and then converted to the [ONNX](https://onnx.ai/) format.

## Intended uses & limitations

The model has been trained to classify cells from specific kinds of tables that contain mainly handwritten text.
It has not been tested on other types of table cell data.

## Training and validation data

The training dataset consisted of

- empty cell images: 2943
- non-empty cell images: 5033

The validation dataset consisted of

- empty cell images: 367
- non-empty cell images: 627
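
The counts above imply a moderate class imbalance. The short sketch below only does that arithmetic on the numbers listed in this card; the dictionary layout and the `summarize` helper are illustrative and not part of the repository code.

```python
# Image counts copied from the lists above.
counts = {
    "train": {"empty": 2943, "non_empty": 5033},
    "validation": {"empty": 367, "non_empty": 627},
}


def summarize(split_counts):
    """Return the total image count and the share of empty cells in one split."""
    total = split_counts["empty"] + split_counts["non_empty"]
    return total, split_counts["empty"] / total


for split, split_counts in counts.items():
    total, empty_share = summarize(split_counts)
    print(f"{split}: {total} images, {empty_share:.1%} empty")
```

Both splits contain roughly 37% empty cells, so the validation set mirrors the class balance of the training set.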

## Training procedure

The code used for model training is available in the repository in the `train.py` file, which uses functions from
the `augment.py` and `utils.py` files. The required libraries are listed in the `requirements.txt` file.

The model was trained on CPU with the following hyperparameters:

- image size: 2560
- learning rate: 0.0001
- train batch size: 32
- epochs: 15
- patience: 3 epochs
- optimizer: Adam
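
To illustrate how the `epochs` and `patience` settings might interact, here is a minimal sketch. It assumes that patience means early stopping on the validation loss, which this card does not state explicitly. The function name and the example losses are made up for illustration; the actual logic lives in `train.py` and may differ.

```python
def early_stopping_replay(val_losses, max_epochs=15, patience=3):
    """Replay per-epoch validation losses and report where training would stop.

    `val_losses` stands in for the loss a real training loop would compute
    after each epoch. Returns (last_epoch_run, best_epoch, best_loss).
    """
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best_loss:
            # New best: remember it and reset the patience counter.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # No improvement for `patience` epochs in a row.
    return epoch, best_epoch, best_loss


# Made-up losses: the best value is reached in epoch 3, so training stops after epoch 6.
print(early_stopping_replay([0.30, 0.12, 0.08, 0.09, 0.10, 0.11, 0.07]))
```

Under this assumption, training ends either after 15 epochs or as soon as three consecutive epochs fail to improve on the best validation loss, whichever comes first.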

## Evaluation results

Evaluation results on the validation dataset are listed below:

|Validation loss|Validation accuracy|Validation F1-score|
|:----|:----|:----|
|0.0427|0.9899|0.9903|
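
For readers unfamiliar with these metrics, the sketch below shows how accuracy and the F1-score are commonly computed for a binary classifier from confusion-matrix counts. The counts in the example are hypothetical, since this card does not report a confusion matrix, and it does not say which class was treated as positive or how the F1-score was averaged.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, f1) for one positive class from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # F1 is the harmonic mean of precision and recall, written here in count form.
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


# Hypothetical counts, not taken from the model's evaluation.
accuracy, f1 = binary_metrics(tp=90, fp=10, fn=5, tn=95)
print(f"accuracy={accuracy:.4f}, f1={f1:.4f}")
```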

## Inference

Inference can be performed using the code in the `test.py` file.
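
As a rough orientation, the following sketch shows what single-image inference with ONNX Runtime could look like. It is not the code from `test.py`. The model path `model.onnx`, the input size of 224, the ImageNet normalization, the two-logit output, and the label order are all assumptions that should be checked against the repository before use.

```python
import math

# Hypothetical label order; check `test.py` for the order the model actually uses.
LABELS = ("empty", "non-empty")


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(logits, labels=LABELS):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def classify_cell(image_path, model_path="model.onnx", size=224):
    """Classify one cell image with ONNX Runtime (all defaults are assumptions)."""
    # Imported here so the helpers above work without these packages installed.
    import numpy as np
    import onnxruntime as ort
    from PIL import Image

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name

    # Assumed preprocessing: RGB, fixed square size, ImageNet mean and std.
    image = Image.open(image_path).convert("RGB").resize((size, size))
    pixels = np.asarray(image, dtype=np.float32) / 255.0
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    chw = ((pixels - mean) / std).transpose(2, 0, 1)  # HWC -> CHW
    batch = np.ascontiguousarray(chw[np.newaxis, ...])  # add batch dimension

    # Assumes the first output holds one row of two class scores.
    logits = session.run(None, {input_name: batch})[0][0]
    return label_from_logits(logits.tolist())
```

A call such as `classify_cell("cell.jpg")` would return a `(label, probability)` pair, where `cell.jpg` is a placeholder file name.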