Semantic Specialization for Knowledge-based Word Sense Disambiguation
- This repository contains the trained model (projection heads) and the sense/context embeddings used to train and evaluate it.
- For instructions on using these files, see the semantic_specialization_for_wsd repository.
Trained Model (Projection Heads)
- File: checkpoints/baseline/last.ckpt
- This checkpoint is one of the trained models used to report the main results (Table 2 in [Mizuki and Okazaki, EACL2023]).
NOTE: Five training runs were performed in total; this checkpoint comes from one of them.
- The main hyperparameters used for training are as follows:
| Argument name | Value | Description |
|---|---|---|
| max_epochs | 15 | Maximum number of training epochs |
| cfg_similarity_class.temperature ($\beta^{-1}$) | 0.015625 (=1/64) | Temperature parameter for the contrastive loss |
| batch_size ($N_B$) | 256 | Number of samples in each batch for the attract-repel and self-training objectives |
| coef_max_pool_margin_loss ($\alpha$) | 0.2 | Coefficient for the self-training loss |
| cfg_gloss_projection_head.n_layer | 2 | Number of FFNN layers for the projection heads |
| cfg_gloss_projection_head.max_l2_norm_ratio ($\epsilon$) | 0.015 | Hyperparameter for the distance constraint integrated in the projection heads |
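
To give a feel for what a temperature of 1/64 does, the sketch below divides similarities by the temperature before a softmax, as in a generic InfoNCE-style contrastive loss. It is an illustration only: the function name `contrastive_loss` and the exact loss form are assumptions, not the implementation used in the semantic_specialization_for_wsd repository. The only value taken from the table is the temperature itself.

```python
import math

# Value from the hyperparameter table: temperature = beta^{-1} = 1/64.
TEMPERATURE = 0.015625
BETA = 1.0 / TEMPERATURE  # inverse temperature


def contrastive_loss(sim_positive, sim_negatives, temperature=TEMPERATURE):
    """Generic InfoNCE-style loss (hypothetical helper, for illustration).

    Computes -log softmax of the positive pair's similarity among all
    candidates, after dividing every similarity by the temperature.
    """
    logits = [sim_positive / temperature]
    logits += [s / temperature for s in sim_negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(
        sum(math.exp(value - max_logit) for value in logits)
    )
    return log_sum_exp - logits[0]


if __name__ == "__main__":
    # The same similarity gap is penalized very differently at the
    # two temperatures: a small temperature sharpens the softmax.
    for temperature in (1.0, TEMPERATURE):
        loss = contrastive_loss(0.6, [0.5, 0.4], temperature=temperature)
        print(f"temperature={temperature}: loss={loss:.4f}")
```

With a temperature this small, a similarity difference of 0.1 becomes a logit difference of 6.4, so the loss is dominated by the negatives that are closest to the positive.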
Sense/context embeddings
- Directory: data/bert_embeddings/
- Sense embeddings: bert-large-cased_WordNet_Gloss_Corpus.hdf5
- Context embeddings for the self-training objective: bert-large-cased_SemCor.hdf5
- Context embeddings for evaluating the WSD task: bert-large-cased_WSDEval-ALL.hdf5
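
The internal layout of these HDF5 files is not described here, so a reasonable first step is to list what each file contains. The sketch below is a minimal example assuming the third-party `h5py` package is installed; the helper names `embedding_path` and `summarize_hdf5` and the dictionary keys are hypothetical and chosen for illustration. Only the directory and file names come from the list above.

```python
from pathlib import Path

# Directory and file names as listed above.
EMBEDDING_DIR = Path("data/bert_embeddings")
EMBEDDING_FILES = {
    "sense": "bert-large-cased_WordNet_Gloss_Corpus.hdf5",
    "context_self_training": "bert-large-cased_SemCor.hdf5",
    "context_wsd_eval": "bert-large-cased_WSDEval-ALL.hdf5",
}


def embedding_path(kind, root=EMBEDDING_DIR):
    """Build the path to one embedding file (hypothetical helper)."""
    return Path(root) / EMBEDDING_FILES[kind]


def summarize_hdf5(path):
    """Return (name, shape, dtype) for every dataset in an HDF5 file.

    Makes no assumption about group or dataset names; it simply walks
    the whole file.
    """
    import h5py  # third-party; imported lazily so the path helper works without it

    summary = []

    def visit(name, obj):
        # visititems calls this for groups and datasets; keep datasets only.
        if isinstance(obj, h5py.Dataset):
            summary.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(str(path), "r") as handle:
        handle.visititems(visit)
    return summary


if __name__ == "__main__":
    for kind in EMBEDDING_FILES:
        path = embedding_path(kind)
        if path.exists():
            # Print only the first few entries; the files may be large.
            for name, shape, dtype in summarize_hdf5(path)[:5]:
                print(kind, name, shape, dtype)
        else:
            print(f"{path} not found; download the files first")
```

For the actual loading code used in training and evaluation, the semantic_specialization_for_wsd repository remains the authoritative source.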
Reference
@inproceedings{Mizuki:EACL2023,
title = "Semantic Specialization for Knowledge-based Word Sense Disambiguation",
author = "Mizuki, Sakae and Okazaki, Naoaki",
booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
series = "EACL",
month = may,
year = "2023",
address = "Dubrovnik, Croatia",
publisher = "Association for Computational Linguistics",
pages = "3449--3462",
}