DictaBERT is a state-of-the-art language model for Hebrew, described in the article at https://arxiv.org/abs/2308.16687.
This is the fine-tuned BERT-large model for the question-answering task using the HeQ dataset.
The BERT-base models for other tasks are listed in the DictaBERT collection on Hugging Face.
Sample usage:
from transformers import pipeline
oracle = pipeline('question-answering', model='dicta-il/dictabert-large-heq')
context = 'בניית פרופילים של משתמשים נחשבת על ידי רבים כאיום פוטנציאלי על הפרטיות. מסיבה זו הגבילו חלק מהמדינות באמצעות חקיקה את המידע שניתן להשיג באמצעות עוגיות ואת אופן השימוש בעוגיות. ארצות הברית, למשל, קבעה חוקים נוקשים בכל הנוגע ליצירת עוגיות חדשות. חוקים אלו, אשר נקבעו בשנת 2000, נקבעו לאחר שנחשף כי המשרד ליישום המדיניות של הממשל האמריקאני נגד השימוש בסמים (ONDCP) בבית הלבן השתמש בעוגיות כדי לעקוב אחרי משתמשים שצפו בפרסומות נגד השימוש בסמים במטרה לבדוק האם משתמשים אלו נכנסו לאתרים התומכים בשימוש בסמים. דניאל בראנט, פעיל הדוגל בפרטיות המשתמשים באינטרנט, חשף כי ה-CIA שלח עוגיות קבועות למחשבי אזרחים במשך עשר שנים. ב-25 בדצמבר 2005 גילה בראנט כי הסוכנות לביטחון לאומי (ה-NSA) השאירה שתי עוגיות קבועות במחשבי מבקרים בגלל שדרוג תוכנה. לאחר שהנושא פורסם, הם ביטלו מיד את השימוש בהן.'
question = 'כיצד הוגבל המידע שניתן להשיג באמצעות העוגיות?'
oracle(question=question, context=context)
Output:
{
"score": 0.9999945163726807,
"start": 101,
"end": 114,
"answer": "באמצעות חקיקה"
}
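The `start` and `end` fields are character offsets into `context`, so slicing the context with them should reproduce `answer`. The sketch below illustrates this and adds a simple score cutoff. It is not part of the DictaBERT release: `span_from_result` and `accept_answer` are hypothetical helper names, the 0.5 threshold is an arbitrary assumption, and the result dictionary is copied from the sample output rather than produced by running the model. Only the first two sentences of the context are included, which is enough to cover the answer span.

```python
# Result dictionary copied from the sample output (not recomputed here).
result = {
    "score": 0.9999945163726807,
    "start": 101,
    "end": 114,
    "answer": "באמצעות חקיקה",
}

# First two sentences of the sample context; they contain the answer span.
context = (
    "בניית פרופילים של משתמשים נחשבת על ידי רבים כאיום פוטנציאלי על הפרטיות. "
    "מסיבה זו הגבילו חלק מהמדינות באמצעות חקיקה את המידע שניתן להשיג "
    "באמצעות עוגיות ואת אופן השימוש בעוגיות."
)


def span_from_result(context: str, result: dict) -> str:
    """Slice the answer out of the context using the character offsets."""
    return context[result["start"]:result["end"]]


def accept_answer(result: dict, min_score: float = 0.5):
    """Return the answer text, or None when the score is below min_score.

    min_score is an assumed cutoff for illustration, not a recommended value.
    """
    return result["answer"] if result["score"] >= min_score else None


# The sliced span should match the answer string reported by the pipeline.
print(span_from_result(context, result) == result["answer"])
print(accept_answer(result))
```

A cutoff like this is one simple way to discard low-confidence answers; a suitable threshold depends on the application and would need to be tuned on held-out data.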
If you use DictaBERT in your research, please cite "DictaBERT: A State-of-the-Art BERT Suite for Modern Hebrew".
BibTeX:
@misc{shmidman2023dictabert,
title={DictaBERT: A State-of-the-Art BERT Suite for Modern Hebrew},
author={Shaltiel Shmidman and Avi Shmidman and Moshe Koppel},
year={2023},
eprint={2308.16687},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
This work is licensed under a Creative Commons Attribution 4.0 International License.