---
base_model:
- Ultralytics/YOLOv8
pipeline_tag: image-segmentation
license: agpl-3.0
---

## Text line detection from Finnish 19th century Court Records

This model detects text lines in digitized 19th-century court record documents.
It was trained using Ultralytics' yolov8x-seg as the base model.


## Intended uses & limitations

<img src='text_line_example.jpg' width='500'>

Most of the training data consists of handwritten documents, but the model also appears to generalize well to typeset documents.

## Training data

The training dataset consisted of 4615 digitized and annotated 19th-century court record documents, and the validation
dataset contained 574 annotated document images.

## Training procedure

This model was trained using 2 NVIDIA RTX A6000 GPUs with the following hyperparameters:

- image size: 640
- learning rate (lr0): 0.05
- train batch size: 32
- epochs: 100
- patience: 10 epochs
- optimizer: SGD
- scheduler: cosine learning rate scheduler (cos_lr=True)
- workers: 4

Default settings were used for other training hyperparameters (find more information [here](https://docs.ultralytics.com/modes/train/#train-settings)).
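
To illustrate what the cosine scheduler does to the learning rate over the 100 epochs, here is a minimal sketch. It assumes the final learning rate fraction `lrf` is left at 0.01 (believed to be the Ultralytics default, but not stated above) and ignores the warmup epochs, so the values are indicative only. The function name `cosine_lr` is hypothetical.

```python
import math


def cosine_lr(epoch: int, epochs: int = 100, lr0: float = 0.05, lrf: float = 0.01) -> float:
    """Approximate learning rate at `epoch` for a cosine decay from lr0 to lr0 * lrf.

    lrf=0.01 is an assumption (not listed in the hyperparameters above),
    and warmup is not modelled.
    """
    # The factor falls smoothly from 1.0 at epoch 0 to lrf at the last epoch.
    factor = (1 - math.cos(epoch * math.pi / epochs)) / 2 * (lrf - 1) + 1
    return lr0 * factor


for epoch in (0, 25, 50, 75, 100):
    print(f"epoch {epoch:3d}: lr = {cosine_lr(epoch):.5f}")
```

Under these assumptions the rate starts at 0.05, passes through roughly half of that at epoch 50, and ends near 0.0005. Because `patience` is 10, training may stop before the schedule completes.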

Model training was performed using the following code:

```python
from ultralytics import YOLO

# Use pretrained YOLO segmentation model
model = YOLO('yolov8x-seg.pt')

# Path to .yaml file where data location and object classes are defined
yaml_path = 'text_lines.yaml'

# Start model training with the defined parameters
model.train(data=yaml_path, name='model_name', epochs=100, imgsz=640, workers=4, optimizer='SGD', lr0=0.05, seed=551, val=True, cos_lr=True, patience=10, batch=32, device=[0,1])
```
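
The contents of `text_lines.yaml` are not shown here. As a rough sketch, an Ultralytics dataset file typically names a dataset root, the training and validation image folders, and the class names. The snippet below builds such a file as a string. The directory layout (`datasets/text_lines`, `images/train`, `images/val`), the class name `text_line`, and the helper `build_dataset_yaml` are all hypothetical and should be adapted to the actual data.

```python
def build_dataset_yaml(dataset_root: str, class_names: list[str]) -> str:
    """Return the text of a minimal Ultralytics-style dataset YAML file.

    The train/val sub-folder names are placeholders, not the layout used
    for this model.
    """
    lines = [
        f"path: {dataset_root}",   # dataset root directory
        "train: images/train",     # training images, relative to path
        "val: images/val",         # validation images, relative to path
        "names:",                  # class index -> class name
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Hypothetical root folder and class name, for illustration only
dataset_yaml = build_dataset_yaml("datasets/text_lines", ["text_line"])
print(dataset_yaml)
```

Saving the printed text as `text_lines.yaml` would give a file that can be passed as the `data` argument in the training call.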

## Evaluation results

Evaluation results using the validation dataset are listed below:

|Class|Images|Class instances|Box precision|Box recall|Box mAP50|Box mAP50-95|Mask precision|Mask recall|Mask mAP50|Mask mAP50-95|
|:----|:----|:----|:----|:----|:----|:----|:----|:----|:----|:----|
|Text line|574|43156|0.912|0.888|0.949|0.701|0.935|0.907|0.954|0.55|

More information on the performance metrics can be found [here](https://docs.ultralytics.com/guides/yolo-performance-metrics/).
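
Precision and recall can be combined into a single F1 score (their harmonic mean) when one summary number is convenient. The sketch below derives it from the reported precision and recall values. F1 is not part of the reported results and is shown only as a worked calculation.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values taken from the evaluation table
box_f1 = f1_score(0.912, 0.888)
mask_f1 = f1_score(0.935, 0.907)

print(f"Box F1:  {box_f1:.3f}")   # 0.900
print(f"Mask F1: {mask_f1:.3f}")  # 0.921
```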

## Inference

If the model file `tuomiokirja_lines_05122023.pt` has been downloaded to `\models\tuomiokirja_lines_05122023.pt`
and the input image is at `\data\image.jpg`, inference can be performed using the following code:

```python
from ultralytics import YOLO

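# Initialize the model. Raw strings (r'...') keep the backslashes literal;
# without them, '\t' in the path would be read as a tab character.
model = YOLO(r'\models\tuomiokirja_lines_05122023.pt')
# Run inference; save=True also writes the annotated image to disk
prediction_results = model.predict(source=r'\data\image.jpg', save=True)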
```
More information on the available inference arguments can be found [here](https://docs.ultralytics.com/modes/predict/#inference-arguments).
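
The predicted text lines are not necessarily returned in reading order. A minimal sketch of one way to order them is given below, assuming a simple single-column page where sorting by the mean vertical position of each line polygon is good enough. The helper `sort_lines_top_to_bottom` is hypothetical, and the polygons in the example are made up for illustration.

```python
def sort_lines_top_to_bottom(polygons):
    """Sort line polygons by their mean y coordinate (top of the page first).

    Each polygon is a sequence of (x, y) pixel coordinates. Empty polygons
    are dropped. Assumes a single-column layout.
    """
    cleaned = [
        [(float(x), float(y)) for x, y in polygon]
        for polygon in polygons
        if len(polygon) > 0
    ]
    return sorted(cleaned, key=lambda polygon: sum(y for _, y in polygon) / len(polygon))


# Made-up polygons standing in for predicted line masks
lower_line = [(10, 120), (300, 120), (300, 150), (10, 150)]
upper_line = [(10, 20), (300, 20), (300, 50), (10, 50)]

ordered_lines = sort_lines_top_to_bottom([lower_line, [], upper_line])
print([polygon[0] for polygon in ordered_lines])  # first point of each line, top line first
```

With Ultralytics results, the polygons would usually come from `prediction_results[0].masks.xy`, after checking that `masks` is not `None` (no lines detected). Multi-column or heavily skewed pages would need a more careful ordering rule.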