---
license: mit
language:
- en
metrics:
- accuracy
- mse
- f1
base_model:
- dmis-lab/biobert-base-cased-v1.2
- google-bert/bert-base-cased
pipeline_tag: text-classification
model-index:
- name: bert-causation-rating-dr2
  results:
  - task:
      type: text-classification
    dataset:
      name: rating_dr2
      type: dataset
    metrics:
    - name: log-based ordinal loss with distance power 3.0
      type: loss
      value: 0.004970851354300976
    - name: off by 1 accuracy
      type: accuracy
      value: 100.00
    - name: mean squared error for ordinal data
      type: mse
      value: 0.000
    - name: weighted F1 score
      type: f1
      value: 1.000
    - name: Kendall's tau coefficient
      type: Kendall's tau
      value: 1.000
    source:
      name: Keling Wang
      url: https://github.com/Keling-Wang
datasets:
- kelingwang/causation_strength_rating
---

# Model description

This `bert-causation-rating-dr2` model is a [biobert-base-cased-v1.2](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2) model fine-tuned on a small set of manually annotated texts with causation labels. It classifies a sentence by the strength of causation that the sentence expresses. Before being tuned on this dataset, the `biobert-base-cased-v1.2` model was fine-tuned on a dataset containing causation labels from a published paper; that is, this model starts from the pre-trained [`kelingwang/bert-causation-rating-pubmed`](https://huggingface.co/kelingwang/bert-causation-rating-pubmed). For more information, please see that link and my [GitHub page](https://github.com/Keling-Wang/causation_rating).

The sentences in the dataset were rated independently by two researchers. This `dr2` version is tuned on the set of sentences with labels rated by Raters 2 and 3.

# Intended use and limitations

This model is primarily used to rate the strength of causation expressed in a sentence extracted from a clinical guideline in the field of diabetes mellitus management.
This model predicts strength of causation (SoC) labels from the text input as follows:

* -1: No correlation or variable relationship is mentioned in the sentence.
* 0: The sentence mentions a correlational relationship but not causation.
* 1: The sentence expresses weak causation.
* 2: The sentence expresses moderate causation.
* 3: The sentence expresses strong causation.

*NOTE:* The model outputs five logits, one per class, and the classes are 0-indexed, so the predicted labels run from 0 to 4. To make predictions, it is recommended to use [this `python` module](https://github.com/Keling-Wang/causation_rating/blob/main/tests/prediction_from_pretrained.py).

# Performance and hyperparameters

## Test metrics

This model achieves the following results on the test dataset. The test dataset is a 25% held-out stratified split of the entire dataset with `SEED=114514`.

* Loss: 0.0049709
* Off-by-1 accuracy: 100.0000
* Off-by-2 accuracy: 100.0000
* MSE for ordinal data: 0.0000
* Weighted F1: 1.0000
* Kendall's Tau: 1.0000

## Hyperparameter tuning metrics

During hyperparameter tuning, the best hyperparameters give the following results, averaged over 4-fold cross-validation:

* Loss: 0.519251
* Off-by-1 accuracy: 98.3803
* Off-by-2 accuracy: 99.8944
* MSE for ordinal data: 0.02359
* Weighted F1: 0.9837
* Kendall's Tau: 0.9901

This performance is achieved with the following hyperparameters:

* Learning rate: 7.96862e-05
* Weight decay: 0.148775
* Warmup ratio: 0.460611
* Power of polynomial learning rate scheduler: 1.129829
* Power of the distance measure used in the loss function, \alpha: 3.0

# Training settings

The following training configurations apply:

* Pre-trained model: `kelingwang/bert-causation-rating-pubmed`
* `seed`: 114514
* `batch_size`: 128
* `epoch`: 8
* `max_length` in `torch.utils.data.Dataset`: 128
* Loss function: the [OLL loss](https://aclanthology.org/2022.coling-1.407/) with a tunable hyperparameter \alpha (power of the distance measure used in the loss
function).
* `lr`: 7.96862e-05
* `weight_decay`: 0.148775
* `warmup_ratio`: 0.460611
* `lr_scheduler_type`: polynomial
* `lr_scheduler_kwargs`: `{"power": 1.129829, "lr_end": 1e-8}`
* Power of the distance measure used in the loss function, \alpha: 3.0

# Framework versions and devices

This model was run on an NVIDIA P100 GPU provided by Kaggle. Framework versions are:

* python==3.10.14
* cuda==12.4
* NVIDIA-SMI==550.90.07
* torch==2.4.0
* transformers==4.45.1
* scikit-learn==1.2.2
* optuna==4.0.0
* nlpaug==1.1.11
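
# Example: decoding outputs and the OLL loss

The sketch below illustrates three ideas from this card in plain Python: turning the five 0-indexed logits back into SoC labels, the OLL loss with distance power \alpha, and an off-by-k accuracy. It is not the code used to train or evaluate this model, and the linked prediction module remains the recommended way to run the model. The following are assumptions, not taken from the repository: class index `k` is taken to mean SoC label `k - 1`; the loss is written as it is usually stated for OLL-\alpha in the linked paper, `-sum_i log(1 - p_i) * |i - y|^alpha`; off-by-k accuracy is taken to be the percentage of predictions within `k` classes of the true class; and all function names are hypothetical.

```python
import math

# Assumption: class index k corresponds to SoC label k - 1
# (indices 0..4 map to labels -1..3).
INDEX_TO_SOC = {0: -1, 1: 0, 2: 1, 3: 2, 4: 3}


def softmax(logits):
    """Convert raw logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_soc(logits):
    """Return the SoC label of the highest-scoring class."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return INDEX_TO_SOC[best]


def oll_loss(logits, target_index, alpha=3.0, eps=1e-12):
    """OLL-alpha for one example: -sum_i log(1 - p_i) * |i - y|^alpha.

    Probability mass placed far from the true class is penalised more
    heavily as alpha grows; the true class itself contributes zero.
    """
    probs = softmax(logits)
    return -sum(
        math.log(max(1.0 - p, eps)) * abs(i - target_index) ** alpha
        for i, p in enumerate(probs)
    )


def off_by_k_accuracy(pred_indices, true_indices, k=1):
    """Percentage of predictions within k classes of the true class (assumed definition)."""
    hits = sum(abs(p - t) <= k for p, t in zip(pred_indices, true_indices))
    return 100.0 * hits / len(true_indices)


if __name__ == "__main__":
    example_logits = [0.1, 0.2, 5.0, 0.3, 0.0]
    print("SoC label:", decode_soc(example_logits))
    print("OLL loss vs. true index 2:", oll_loss(example_logits, 2))
    print("Off-by-1 accuracy:", off_by_k_accuracy([0, 2, 4], [1, 2, 2], k=1))
```

With `alpha=3.0`, as in the training settings above, a confident prediction four classes away from the truth costs far more than the same confidence one class away, which is the ordinal behaviour the loss is meant to provide.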