
scibert-wechsel-korean

SciBERT (🇺🇸) converted into Korean (🇰🇷) using the WECHSEL technique.

Description

  • SciBERT is pretrained on papers from the semanticscholar.org corpus. The corpus contains 1.14M papers (3.1B tokens).
  • WECHSEL transfers a model across languages by converting the embedding layer's subword tokens from the source language to the target language.
  • This model is the English-trained SciBERT converted into Korean with WECHSEL.
  • The Korean tokenizer is taken from the KLUE PLMs, chosen for its similar vocabulary size (32,000) and its performance.
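
The embedding conversion described above can be illustrated with a toy sketch. This is not the code used to build this model and not the API of the WECHSEL library. It is a simplified pure-Python illustration, assuming each target (Korean) subword is initialized as a similarity-weighted average of source (English) subword embeddings, with similarities measured in a shared, aligned vector space. The function name `init_target_embedding`, the parameters `k` and `temperature`, and all vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def init_target_embedding(target_vec, source_aligned, source_model_emb,
                          k=2, temperature=0.1):
    """Initialize one target-subword embedding (hypothetical helper).

    target_vec:       the target subword's vector in the aligned space
    source_aligned:   source subword -> vector in the same aligned space
    source_model_emb: source subword -> embedding row of the source model
    """
    # Keep the k source subwords most similar to the target subword.
    neighbours = sorted(
        ((cosine(target_vec, vec), tok) for tok, vec in source_aligned.items()),
        reverse=True,
    )[:k]
    # Softmax over similarities; a lower temperature gives sharper weights.
    weights = [math.exp(sim / temperature) for sim, _ in neighbours]
    total = sum(weights)
    dim = len(next(iter(source_model_emb.values())))
    out = [0.0] * dim
    for w, (_, tok) in zip(weights, neighbours):
        for i, x in enumerate(source_model_emb[tok]):
            out[i] += (w / total) * x
    return out


# Toy data (made up for illustration only).
source_aligned = {"cell": [1.0, 0.0], "protein": [0.0, 1.0], "the": [-1.0, 0.0]}
source_model_emb = {
    "cell": [1.0, 2.0, 3.0],
    "protein": [3.0, 2.0, 1.0],
    "the": [0.0, 0.0, 0.0],
}

# A Korean subword whose aligned vector lies close to English "cell".
vec = init_target_embedding([0.9, 0.1], source_aligned, source_model_emb)
```

In this toy setup the resulting vector lands very close to the source embedding of "cell", because that subword dominates the softmax weights. All other layers of the source model are kept unchanged in this kind of transfer; only the embedding matrix is rebuilt for the new vocabulary.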
