zenz-v1 Checkpoints

zenz-v1 is a language model based on the GPT-2 architecture and specialized for kana-kanji conversion. It is intended for use in the neural kana-kanji conversion system "Zenzai."

This repository publishes the checkpoints for zenz-v1.

90M parameters
Character-level + byte-level BPE tokenizer
High performance in kana-kanji conversion tasks using greedy decoding
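
For readers unfamiliar with greedy decoding, the following is a minimal sketch of the idea: at each step the single highest-scoring next token is appended, with no sampling and no beam search. It does not load zenz-v1. The scorer `toy_scores`, the table `TOY_TABLE`, the `</s>` end marker, and the function name `greedy_decode` are all hypothetical, chosen only for illustration, and the actual input format expected by zenz-v1 is not shown here.

```python
from typing import Callable, Dict, List


def greedy_decode(
    score_next: Callable[[List[str]], Dict[str, float]],
    prompt: List[str],
    eos: str = "</s>",
    max_new_tokens: int = 16,
) -> List[str]:
    """Append the highest-scoring next token until EOS or the length limit."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = score_next(tokens)
        best = max(scores, key=scores.get)  # argmax over candidate tokens
        if best == eos:
            break
        tokens.append(best)
    # Return only the newly generated tokens, not the prompt.
    return tokens[len(prompt):]


# Hypothetical toy scorer standing in for a real language model.
# It looks only at the last token, which a real model would not do.
TOY_TABLE: Dict[str, Dict[str, float]] = {
    "じ": {"漢": 0.6, "感": 0.4},
    "漢": {"字": 0.7, "</s>": 0.3},
    "字": {"</s>": 1.0},
}


def toy_scores(tokens: List[str]) -> Dict[str, float]:
    return TOY_TABLE.get(tokens[-1], {"</s>": 1.0})


result = greedy_decode(toy_scores, list("かんじ"))
print("".join(result))
```

Because each step keeps only one candidate, the cost is one forward pass per generated token, which is part of why greedy decoding suits latency-sensitive input methods.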

Model Details

Model Description

The base model is ku-nlp/gpt2-small-japanese-char, which is provided under CC-BY-SA 4.0.

This model is provided under CC-BY-SA 4.0.

Developed by: Keita Miwa (𝕏)
Model type: GPT-2
Language(s) (NLP): Japanese
License: CC-BY-SA 4.0
Finetuned from model: ku-nlp/gpt2-small-japanese-char

Model Sources

This model is intended for use with Zenzai (AzooKeyKanaKanjiConverter).

Repository: https://github.com/ensan-hcl/AzooKeyKanaKanjiConverter

Acknowledgements

The following libraries, tools, and language resources were used to build this model.

MeCab (https://taku910.github.io/mecab/)
ipadic-NEologd (https://github.com/neologd/mecab-ipadic-neologd)
torch (https://pypi.org/project/torch/)
transformers (https://pypi.org/project/transformers/)
datasets (https://pypi.org/project/datasets/)
jaconv (https://pypi.org/project/jaconv/)
llama.cpp (https://github.com/ggerganov/llama.cpp)
