Regression Model for Exercise Tolerance Functioning Levels (ICF b455)

Description

A fine-tuned regression model that assigns a functioning level to Dutch sentences describing exercise tolerance functions. The model is based on a pre-trained Dutch medical language model (link to be added): a RoBERTa model trained from scratch on clinical notes from the Amsterdam UMC. To detect sentences about exercise tolerance functions in Dutch clinical text, use the icf-domains classification model.

Functioning levels

Level	Meaning
5	MET>6. Can tolerate jogging, hard exercises, running, climbing stairs fast, sports.
4	4≤MET≤6. Can tolerate walking / cycling at a brisk pace, considerable effort (e.g. cycling from 16 km/h), heavy housework.
3	3≤MET<4. Can tolerate walking / cycling at a normal pace, gardening, exercises without equipment.
2	2≤MET<3. Can tolerate walking at a slow to moderate pace, grocery shopping, light housework.
1	1≤MET<2. Can tolerate sitting activities.
0	0≤MET<1. Can physically tolerate only recumbent activities.
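
As an illustration of how the MET boundaries in the table partition the scale, the sketch below maps a MET value to a level. The function name met_to_level is hypothetical and is not part of the model or of any library; the model itself predicts levels from text, not from MET values.

```python
def met_to_level(met: float) -> int:
    """Map a MET value to a functioning level (0-5) using the table's boundaries."""
    if met < 0:
        raise ValueError("MET must be non-negative")
    if met > 6:   # level 5: MET > 6
        return 5
    if met >= 4:  # level 4: 4 <= MET <= 6
        return 4
    if met >= 3:  # level 3: 3 <= MET < 4
        return 3
    if met >= 2:  # level 2: 2 <= MET < 3
        return 2
    if met >= 1:  # level 1: 1 <= MET < 2
        return 1
    return 0      # level 0: 0 <= MET < 1
```

Note that a MET of exactly 6 still belongs to level 4; only values strictly above 6 map to level 5.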

Because this is a regression model, its output is a continuous value that can occasionally fall outside the scale (e.g. 5.2); this is expected behaviour.
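
If downstream code needs values on the 0 to 5 scale, one option is to clip the raw scores and, where discrete levels are required, round them. The sketch below is only an assumption about post-processing; the source does not state that the authors clip or round, and the helper name to_scale is hypothetical.

```python
import numpy as np

def to_scale(raw_predictions, low=0.0, high=5.0):
    """Clip regression outputs to [low, high] and round them to integer levels."""
    clipped = np.clip(np.asarray(raw_predictions, dtype=float), low, high)
    # np.rint rounds halves to the nearest even integer (e.g. 2.5 -> 2).
    levels = np.rint(clipped).astype(int)
    return clipped, levels

clipped, levels = to_scale([5.2, -0.3, 3.13])
```

Clipping discards the information that a score was out of range, so it may be preferable to keep the raw scores for analysis and clip only for display.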

Intended uses and limitations

The model was fine-tuned (trained, validated and tested) on medical records from the Amsterdam UMC (the two academic medical centers of Amsterdam). It might perform differently on text from a different hospital or text from non-hospital sources (e.g. GP records).
The model was fine-tuned with the Simple Transformers library. This library is built on Transformers, but the model cannot be used directly with Transformers pipelines and classes; doing so would produce incorrect outputs. For this reason, the API on this page is disabled.

How to use

To generate predictions with the model, use the Simple Transformers library:

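import numpy as np
from simpletransformers.classification import ClassificationModel

# Load the fine-tuned regression model from the Hugging Face Hub (CPU only).
model = ClassificationModel(
    'roberta',
    'CLTL/icf-levels-ins',
    use_cuda=False,
)

# Dutch input: "can still climb stairs well, but has lost a lot of fitness after Corona"
example = 'kan nog goed traplopen, maar flink ingeleverd aan conditie na Corona'
# predict() returns (predictions, raw_outputs); the raw outputs hold the regression scores.
_, raw_outputs = model.predict([example])
predictions = np.squeeze(raw_outputs)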

The prediction on the example is:

3.13

The raw outputs look like this:

[[3.1300993]]

Training data

The training data consists of clinical notes from medical records (in Dutch) of the Amsterdam UMC. Due to privacy constraints, the data cannot be released.
The annotation guidelines used for the project can be found here.

Training procedure

The default training parameters of Simple Transformers were used, including:

Optimizer: AdamW
Learning rate: 4e-5
Num train epochs: 1
Train batch size: 8
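
The parameters above could be expressed as a Simple Transformers argument dictionary, roughly as follows. This is a sketch rather than the authors' training script: the 'regression' flag and num_labels=1 are assumptions based on how Simple Transformers configures regression, and base_model_path is a hypothetical placeholder because the link to the pre-trained model has not been published.

```python
# Values from the list above; "regression" is an assumption (see lead-in).
TRAIN_ARGS = {
    "optimizer": "AdamW",
    "learning_rate": 4e-5,
    "num_train_epochs": 1,
    "train_batch_size": 8,
    "regression": True,
}

def build_model(base_model_path: str, use_cuda: bool = False):
    """Create a RoBERTa regression model ready for fine-tuning.

    base_model_path is a placeholder for the pre-trained Dutch medical model.
    """
    # Imported lazily so that the configuration can be inspected without the library.
    from simpletransformers.classification import ClassificationModel

    return ClassificationModel(
        "roberta",
        base_model_path,
        num_labels=1,  # a single continuous output (assumption)
        args=TRAIN_ARGS,
        use_cuda=use_cuda,
    )
```

Fine-tuning would then call train_model on a DataFrame of sentences and levels; the exact data format used by the authors is not described in the source.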

Evaluation results

The model is evaluated at the sentence level (the classification unit) and at the note level (the aggregated unit that is meaningful to healthcare professionals).

Metric	Sentence-level	Note-level
mean absolute error	0.69	0.61
mean squared error	0.80	0.64
root mean squared error	0.89	0.80
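
For reference, the three metrics in the table can be computed as in the sketch below. The note-level aggregation shown here (the mean of the sentence-level values within each note) is an assumption; the source does not specify how sentences are aggregated into notes, and both function names are hypothetical.

```python
import numpy as np

def regression_errors(y_true, y_pred):
    """Return the mean absolute, mean squared and root mean squared error."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    mae = float(np.mean(np.abs(err)))
    mse = float(np.mean(err ** 2))
    return {"mae": mae, "mse": mse, "rmse": float(np.sqrt(mse))}

def aggregate_by_note(note_ids, values):
    """Average sentence-level values per note (assumed aggregation rule)."""
    grouped = {}
    for note_id, value in zip(note_ids, values):
        grouped.setdefault(note_id, []).append(float(value))
    return {note_id: float(np.mean(vals)) for note_id, vals in grouped.items()}

sentence_scores = regression_errors([3, 4], [4, 4])
note_means = aggregate_by_note(["a", "a", "b"], [2, 4, 5])
```

To obtain note-level errors under this assumption, aggregate both the gold levels and the predictions per note and pass the aligned values to regression_errors.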
