jonatasgrosman
committed on
Commit
•
964ddc3
1
Parent(s):
f54a36a
first commit
- README.md +156 -0
- config.json +76 -0
- preprocessor_config.json +8 -0
- pytorch_model.bin +3 -0
- special_tokens_map.json +1 -0
- vocab.json +1 -0
README.md
ADDED
@@ -0,0 +1,156 @@
---
language: ja
datasets:
- common_voice
metrics:
- wer
- cer
tags:
- audio
- automatic-speech-recognition
- speech
- xlsr-fine-tuning-week
license: apache-2.0
model-index:
- name: XLSR Wav2Vec2 Japanese by Jonatas Grosman
  results:
  - task:
      name: Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice ja
      type: common_voice
      args: ja
    metrics:
    - name: Test WER
      type: wer
      value: 93.35
    - name: Test CER
      type: cer
      value: 29.24
---

# Wav2Vec2-Large-XLSR-53-Japanese

Fine-tuned [facebook/wav2vec2-large-xlsr-53](https://huggingface.co/facebook/wav2vec2-large-xlsr-53) on Japanese using the [Common Voice](https://huggingface.co/datasets/common_voice) and [CSS10](https://github.com/Kyubyong/css10) datasets.
When using this model, make sure that your speech input is sampled at 16 kHz.

The script used for training can be found here: https://github.com/jonatasgrosman/wav2vec2-sprint

## Usage

The model can be used directly (without a language model) as follows:

```python
import torch
import librosa
from datasets import load_dataset
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

LANG_ID = "ja"
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-japanese"
SAMPLES = 5

test_dataset = load_dataset("common_voice", LANG_ID, split=f"test[:{SAMPLES}]")

processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)

# Preprocessing the datasets.
# We need to read the audio files as arrays
def speech_file_to_array_fn(batch):
    speech_array, sampling_rate = librosa.load(batch["path"], sr=16_000)
    batch["speech"] = speech_array
    batch["sentence"] = batch["sentence"].upper()
    return batch

test_dataset = test_dataset.map(speech_file_to_array_fn)
inputs = processor(test_dataset["speech"], sampling_rate=16_000, return_tensors="pt", padding=True)

with torch.no_grad():
    logits = model(inputs.input_values, attention_mask=inputs.attention_mask).logits

predicted_ids = torch.argmax(logits, dim=-1)
predicted_sentences = processor.batch_decode(predicted_ids)

for i, predicted_sentence in enumerate(predicted_sentences):
    print("-" * 100)
    print("Reference:", test_dataset[i]["sentence"])
    print("Prediction:", predicted_sentence)
```

| Reference  | Prediction |
| ------------- | ------------- |
| 祖母は、おおむね機嫌よく、サイコロをころがしている。 | 都ぼは重い記念よくさいこところがしている |
| 財布をなくしたので、交番へ行きます。 | 財布王なクしたので、交番へへ行きます す |
| 飲み屋のおやじ、旅館の主人、医者をはじめ、交際のある人にきいてまわったら、みんな、私より収入が多いはずなのに、税金は安い。 | ノみアのやじ、旅館の筋時に、医者を初め、交際なる人に聞いて廻ったら、みんな、私しより周入が多い弾ず脱に、制金は安すい |
| 新しい靴をはいて出かけます。 | 新しに靴をはいてかけます |
| このためプラズマ中のイオンや電子の持つ平均運動エネルギーを温度で表現することがある | このため、プラズマ中の医本や、電手のもつ平均運動をエネルギーを穏<unk>で、表現することがある |
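
The `torch.argmax` and `processor.batch_decode` steps in the usage snippet amount to greedy CTC decoding. The following is a minimal pure-Python sketch of that idea, not the library's implementation. It assumes that `<pad>` (id 0) doubles as the CTC blank and that `|` (id 4) marks word boundaries, as the ids in `vocab.json` suggest. The small `ID_TO_TOKEN` table and the `greedy_ctc_decode` helper are hypothetical names used only for illustration.

```python
from itertools import groupby

# Hypothetical excerpt of the id-to-token table; the ids come from vocab.json in this commit.
ID_TO_TOKEN = {0: "<pad>", 4: "|", 30: "の", 84: "布", 391: "財"}
BLANK_ID = 0          # assumption: <pad> doubles as the CTC blank
WORD_DELIMITER = "|"  # assumption: "|" stands for a space between words


def greedy_ctc_decode(frame_ids):
    """Collapse repeated frame predictions, drop blanks, and map ids to text."""
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    tokens = [ID_TO_TOKEN[i] for i in collapsed if i != BLANK_ID]
    return "".join(tokens).replace(WORD_DELIMITER, " ").strip()


# Per-frame argmax ids for a short, made-up utterance.
frames = [0, 391, 391, 0, 84, 84, 84, 0, 0, 30, 0]
print(greedy_ctc_decode(frames))
```

Under these assumptions, a blank between two identical ids keeps both characters, which is one way a doubled character such as `へへ` in the table can end up in a prediction.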

## Evaluation

The model can be evaluated as follows on the Japanese test data of Common Voice.

```python
import torch
import re
import warnings
import librosa
from datasets import load_dataset, load_metric
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

LANG_ID = "ja"
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-japanese"
DEVICE = "cuda"
MAX_SAMPLES = 8000

CHARS_TO_IGNORE = [",", "?", "¿", ".", "!", "¡", ";", ":", '""', "%", '"', "�", "ʿ", "·", "჻", "~", "՞",
                   "؟", "،", "।", "॥", "«", "»", "„", "“", "”", "「", "」", "‘", "’", "《", "》", "(", ")", "[", "]",
                   "=", "`", "_", "+", "<", ">", "…", "–", "°", "´", "ʾ", "‹", "›", "©", "®", "—", "→", "。"]

test_dataset = load_dataset("common_voice", LANG_ID, split="test")
if len(test_dataset) > MAX_SAMPLES:
    test_dataset = test_dataset.select(range(MAX_SAMPLES))

wer = load_metric("wer.py") # https://github.com/jonatasgrosman/wav2vec2-sprint/blob/main/wer.py
cer = load_metric("cer.py") # https://github.com/jonatasgrosman/wav2vec2-sprint/blob/main/cer.py

chars_to_ignore_regex = f"[{re.escape(''.join(CHARS_TO_IGNORE))}]"

processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
model.to(DEVICE)

# Preprocessing the datasets.
# We need to read the audio files as arrays
def speech_file_to_array_fn(batch):
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        speech_array, sampling_rate = librosa.load(batch["path"], sr=16_000)
    batch["speech"] = speech_array
    batch["sentence"] = re.sub(chars_to_ignore_regex, "", batch["sentence"]).upper()
    return batch

test_dataset = test_dataset.map(speech_file_to_array_fn)

# Run batched inference and decode the predictions
def evaluate(batch):
    inputs = processor(batch["speech"], sampling_rate=16_000, return_tensors="pt", padding=True)

    with torch.no_grad():
        logits = model(inputs.input_values.to(DEVICE), attention_mask=inputs.attention_mask.to(DEVICE)).logits

    pred_ids = torch.argmax(logits, dim=-1)
    batch["pred_strings"] = processor.batch_decode(pred_ids)
    return batch

result = test_dataset.map(evaluate, batched=True, batch_size=8)

print("WER: {:.2f}".format(100 * wer.compute(predictions=result["pred_strings"], references=result["sentence"], chunk_size=1000)))
print("CER: {:.2f}".format(100 * cer.compute(predictions=result["pred_strings"], references=result["sentence"], chunk_size=1000)))
```

**Test Result**:

- WER: 93.35%
- CER: 29.24%
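
The gap between the two scores is probably less dramatic than it looks: Japanese is written without spaces, so a word-level metric that splits on whitespace tends to treat a whole sentence as a single word. The sketch below is a plain Levenshtein-based error rate, not the `wer.py`/`cer.py` scripts referenced in the evaluation code. `edit_distance` and `error_rate` are hypothetical helpers, and the sample pair is taken from the Usage table with the final punctuation removed by hand.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    previous = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        current = [i]
        for j, h in enumerate(hyp, start=1):
            current.append(min(
                previous[j] + 1,             # deletion
                current[j - 1] + 1,          # insertion
                previous[j - 1] + (r != h),  # substitution or match
            ))
        previous = current
    return previous[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance normalised by the reference length."""
    return edit_distance(ref_units, hyp_units) / len(ref_units)


reference = "新しい靴をはいて出かけます"
prediction = "新しに靴をはいてかけます"

cer = error_rate(list(reference), list(prediction))      # character level
wer = error_rate(reference.split(), prediction.split())  # whitespace-separated "words"
print(f"CER: {100 * cer:.2f}%  WER: {100 * wer:.2f}%")
```

For this pair, two character edits out of thirteen give a low character error rate, while the word-level score is 100% because the single unsegmented "word" does not match exactly.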
config.json
ADDED
@@ -0,0 +1,76 @@
{
  "_name_or_path": "facebook/wav2vec2-large-xlsr-53",
  "activation_dropout": 0.05,
  "apply_spec_augment": true,
  "architectures": [
    "Wav2Vec2ForCTC"
  ],
  "attention_dropout": 0.1,
  "bos_token_id": 1,
  "conv_bias": true,
  "conv_dim": [
    512,
    512,
    512,
    512,
    512,
    512,
    512
  ],
  "conv_kernel": [
    10,
    3,
    3,
    3,
    3,
    2,
    2
  ],
  "conv_stride": [
    5,
    2,
    2,
    2,
    2,
    2,
    2
  ],
  "ctc_loss_reduction": "mean",
  "ctc_zero_infinity": true,
  "do_stable_layer_norm": true,
  "eos_token_id": 2,
  "feat_extract_activation": "gelu",
  "feat_extract_dropout": 0.0,
  "feat_extract_norm": "layer",
  "feat_proj_dropout": 0.05,
  "final_dropout": 0.0,
  "gradient_checkpointing": true,
  "hidden_act": "gelu",
  "hidden_dropout": 0.05,
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "layer_norm_eps": 1e-05,
  "layerdrop": 0.05,
  "mask_channel_length": 10,
  "mask_channel_min_space": 1,
  "mask_channel_other": 0.0,
  "mask_channel_prob": 0.0,
  "mask_channel_selection": "static",
  "mask_feature_length": 10,
  "mask_feature_prob": 0.0,
  "mask_time_length": 10,
  "mask_time_min_space": 1,
  "mask_time_other": 0.0,
  "mask_time_prob": 0.05,
  "mask_time_selection": "static",
  "model_type": "wav2vec2",
  "num_attention_heads": 16,
  "num_conv_pos_embedding_groups": 16,
  "num_conv_pos_embeddings": 128,
  "num_feat_extract_layers": 7,
  "num_hidden_layers": 24,
  "pad_token_id": 0,
  "transformers_version": "4.5.0.dev0",
  "vocab_size": 1767
}
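
The `conv_kernel` and `conv_stride` lists in `config.json` determine how many logit frames the model emits per second of audio. The sketch below derives that from the config values, assuming the feature extractor is a stack of unpadded 1-D convolutions applied to 16 kHz input. `num_frames` and `receptive_field` are hypothetical helper names, not part of any library.

```python
from math import prod

# Values copied from config.json in this commit.
CONV_KERNEL = [10, 3, 3, 3, 3, 2, 2]
CONV_STRIDE = [5, 2, 2, 2, 2, 2, 2]
SAMPLING_RATE = 16_000  # from preprocessor_config.json


def num_frames(num_samples):
    """Output length after the conv stack, assuming unpadded 1-D convolutions."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNEL, CONV_STRIDE):
        length = (length - kernel) // stride + 1
    return length


def receptive_field():
    """Number of input samples that influence a single output frame."""
    field, jump = 1, 1
    for kernel, stride in zip(CONV_KERNEL, CONV_STRIDE):
        field += (kernel - 1) * jump
        jump *= stride
    return field


hop = prod(CONV_STRIDE)
print("hop:", hop, "samples =", 1000 * hop / SAMPLING_RATE, "ms")
print("receptive field:", receptive_field(), "samples")
print("frames for 1 s of audio:", num_frames(SAMPLING_RATE))
```

Under these assumptions each frame advances by 320 samples (20 ms) and sees 400 samples (25 ms), so one second of audio yields 49 frames over which the CTC head predicts one of the `vocab_size` tokens.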
preprocessor_config.json
ADDED
@@ -0,0 +1,8 @@
{
  "do_normalize": true,
  "feature_size": 1,
  "padding_side": "right",
  "padding_value": 0.0,
  "return_attention_mask": true,
  "sampling_rate": 16000
}
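
Read together, these settings suggest that each waveform is normalised to zero mean and unit variance, right-padded with `0.0` to the longest item in the batch, and accompanied by an attention mask. The following pure-Python sketch illustrates that preprocessing under those assumptions. It is not the feature extractor's actual code; `normalize`, `pad_batch`, and the `EPSILON` value are hypothetical.

```python
from statistics import fmean, pvariance

PADDING_VALUE = 0.0  # from preprocessor_config.json
EPSILON = 1e-7       # assumption: small constant to avoid division by zero


def normalize(waveform):
    """Scale one waveform to zero mean and unit variance."""
    mean = fmean(waveform)
    std = (pvariance(waveform, mu=mean) + EPSILON) ** 0.5
    return [(x - mean) / std for x in waveform]


def pad_batch(waveforms):
    """Right-pad waveforms to a common length and build attention masks."""
    longest = max(len(w) for w in waveforms)
    values, masks = [], []
    for w in waveforms:
        missing = longest - len(w)
        values.append(list(w) + [PADDING_VALUE] * missing)
        masks.append([1] * len(w) + [0] * missing)
    return values, masks


batch = [normalize([0.1, 0.3, -0.2, 0.4]), normalize([0.5, -0.5])]
input_values, attention_mask = pad_batch(batch)
print(attention_mask)
```

The mask marks real samples with 1 and padding with 0, which is what lets the model ignore the padded tail when `attention_mask` is passed alongside `input_values`.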
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:36c88545ca77951dd843524c4bf769f129aaa1ea99cfdafe1b1ca0c77eef1e32
size 1269178519
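
The three lines committed for `pytorch_model.bin` are a Git LFS pointer rather than the weights themselves: they record a SHA-256 digest and a size in bytes. As a rough sketch, a downloaded copy of the file could be checked against the pointer as follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this example, and any local file path passed to them is an assumption.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Split the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {"algorithm": algorithm, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Check a local file against the size and hash recorded in the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algorithm"])
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:36c88545ca77951dd843524c4bf769f129aaa1ea99cfdafe1b1ca0c77eef1e32
size 1269178519
"""
print(parse_lfs_pointer(POINTER))
```

Comparing the size first is a cheap way to reject truncated downloads before hashing roughly 1.27 GB of data.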
special_tokens_map.json
ADDED
@@ -0,0 +1 @@
{"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "pad_token": "<pad>"}
vocab.json
ADDED
@@ -0,0 +1 @@
{"<pad>": 0, "<s>": 1, "</s>": 2, "<unk>": 3, "|": 4, "然": 5, "政": 6, "美": 7, "脅": 8, "痕": 9, "胸": 10, "ほ": 11, "陳": 12, "拾": 13, "被": 14, "暮": 15, "腮": 16, "潰": 17, "群": 18, "難": 19, "承": 20, "技": 21, "歴": 22, "済": 23, "棚": 24, "埋": 25, "刻": 26, "差": 27, "訳": 28, "写": 29, "の": 30, "飯": 31, "ざ": 32, "忍": 33, "不": 34, "燥": 35, "倹": 36, "レ": 37, "積": 38, "察": 39, "教": 40, "防": 41, "成": 42, "季": 43, "退": 44, "締": 45, "礼": 46, "肯": 47, "哀": 48, "繕": 49, "寺": 50, "ガ": 51, "餓": 52, "類": 53, "置": 54, "役": 55, "侮": 56, "迷": 57, "概": 58, "ぽ": 59, "狩": 60, "庇": 61, "味": 62, "勢": 63, "冴": 64, "衛": 65, "六": 66, "名": 67, "寒": 68, "要": 69, "迂": 70, "濯": 71, "選": 72, "卓": 73, "所": 74, "扱": 75, "牛": 76, "状": 77, "旋": 78, "底": 79, "べ": 80, "堅": 81, "銀": 82, "蒼": 83, "布": 84, "ャ": 85, "幅": 86, "兄": 87, "袋": 88, "翌": 89, "宇": 90, "婢": 91, "席": 92, "設": 93, "答": 94, "恰": 95, "暖": 96, "筋": 97, "緒": 98, "払": 99, "息": 100, "齢": 101, "震": 102, "族": 103, "那": 104, "僕": 105, "由": 106, "莨": 107, "派": 108, "徐": 109, "疼": 110, "賛": 111, "翳": 112, "弁": 113, "が": 114, "挨": 115, "筒": 116, "継": 117, "暴": 118, "景": 119, "却": 120, "堤": 121, "遣": 122, "性": 123, "這": 124, "犬": 125, "ぞ": 126, "業": 127, "訓": 128, "こ": 129, "私": 130, "約": 131, "ち": 132, "暑": 133, "粧": 134, "官": 135, "引": 136, "呂": 137, "マ": 138, "飾": 139, "を": 140, "骨": 141, "棒": 142, "頁": 143, "子": 144, "豊": 145, "戸": 146, "改": 147, "告": 148, "谷": 149, "袍": 150, "棉": 151, "習": 152, "塩": 153, "罪": 154, "内": 155, "堀": 156, "比": 157, "腑": 158, "己": 159, "屑": 160, "魚": 161, "寓": 162, "樹": 163, "動": 164, "呑": 165, "板": 166, "批": 167, "拍": 168, "幾": 169, "爺": 170, "弓": 171, "離": 172, "缶": 173, "コ": 174, "甜": 175, "洗": 176, "刊": 177, "し": 178, "簾": 179, "毛": 180, "瀬": 181, "省": 182, "漁": 183, "郊": 184, "敗": 185, "娘": 186, "欧": 187, "転": 188, "諸": 189, "般": 190, "慕": 191, "膜": 192, "断": 193, "癖": 194, "怒": 195, "因": 196, "肺": 197, "満": 198, "巣": 199, "ス": 200, "扇": 201, "掠": 202, "番": 203, "袖": 204, "啓": 205, "襟": 206, "温": 207, "籠": 208, "止": 
209, "ブ": 210, "ぼ": 211, "膿": 212, "?": 213, "乱": 214, "み": 215, "神": 216, "お": 217, "島": 218, "濶": 219, "奇": 220, "暇": 221, "猫": 222, "各": 223, "距": 224, "伜": 225, "姿": 226, "俺": 227, "四": 228, "園": 229, "ゃ": 230, "炒": 231, "扶": 232, "為": 233, "到": 234, "絹": 235, "ケ": 236, "嚼": 237, "研": 238, "勤": 239, ")": 240, "話": 241, "イ": 242, "と": 243, "傾": 244, "入": 245, "祟": 246, "天": 247, "妻": 248, "眉": 249, "形": 250, "秘": 251, "練": 252, "漏": 253, "惜": 254, "独": 255, "雄": 256, "セ": 257, "繊": 258, "激": 259, "尚": 260, "回": 261, "頓": 262, "額": 263, "葢": 264, "史": 265, "野": 266, "売": 267, "余": 268, "良": 269, "起": 270, "縛": 271, "軽": 272, "驚": 273, "歯": 274, "喧": 275, "茸": 276, "現": 277, "七": 278, "う": 279, "零": 280, "ダ": 281, "ギ": 282, "村": 283, "土": 284, "巻": 285, "宜": 286, "り": 287, "伸": 288, "双": 289, "卵": 290, "築": 291, "者": 292, "炭": 293, "勝": 294, "妾": 295, "籍": 296, "ネ": 297, "図": 298, "堪": 299, "て": 300, "経": 301, "ご": 302, "赤": 303, "巫": 304, "酷": 305, "焉": 306, "校": 307, "女": 308, "泳": 309, "刀": 310, "玩": 311, "忙": 312, "叮": 313, "早": 314, "度": 315, "ぬ": 316, "妙": 317, "緩": 318, "木": 319, "躇": 320, "皺": 321, "惹": 322, "踊": 323, "わ": 324, "千": 325, "雨": 326, "眼": 327, "寧": 328, "な": 329, "テ": 330, "肌": 331, "部": 332, "鹸": 333, "劫": 334, "少": 335, "釈": 336, "感": 337, "臥": 338, "エ": 339, "太": 340, "十": 341, "嵩": 342, "曜": 343, "駁": 344, "窒": 345, "祷": 346, "命": 347, "突": 348, "反": 349, "盾": 350, "君": 351, "欠": 352, "盆": 353, "広": 354, "載": 355, "貸": 356, "通": 357, "頼": 358, "脹": 359, "寿": 360, "ぴ": 361, "篏": 362, "可": 363, "紳": 364, "禰": 365, "吐": 366, "服": 367, "祖": 368, "機": 369, "責": 370, "英": 371, "襲": 372, "擱": 373, "洋": 374, "沢": 375, "看": 376, "分": 377, "遂": 378, "罵": 379, "制": 380, "曲": 381, "魔": 382, "嚢": 383, "、": 384, "僧": 385, "婚": 386, "殊": 387, "属": 388, "憶": 389, "朱": 390, "財": 391, "情": 392, "ま": 393, "容": 394, "拭": 395, "希": 396, "昔": 397, "懇": 398, "暗": 399, "落": 400, "ワ": 401, "店": 402, "互": 403, "爆": 404, "期": 405, "音": 406, "靠": 407, "卒": 408, "古": 
409, "復": 410, "砲": 411, "留": 412, "足": 413, "聖": 414, "補": 415, "我": 416, "利": 417, "参": 418, "穿": 419, "稿": 420, "午": 421, "嬌": 422, "小": 423, "久": 424, "御": 425, "ク": 426, "慌": 427, "ー": 428, "交": 429, "蔵": 430, "倚": 431, "解": 432, "加": 433, "��": 434, "講": 435, "遅": 436, "大": 437, "瞥": 438, "汰": 439, "疎": 440, "貧": 441, "酒": 442, "冊": 443, "高": 444, "界": 445, "臨": 446, "執": 447, "時": 448, "跡": 449, "豆": 450, "排": 451, "張": 452, "遇": 453, "バ": 454, "投": 455, "着": 456, "箪": 457, "森": 458, "整": 459, "妹": 460, "密": 461, "予": 462, "鳥": 463, "描": 464, "装": 465, "ゅ": 466, "黒": 467, "秩": 468, "嘆": 469, "倒": 470, "媚": 471, "評": 472, "幼": 473, "凸": 474, "存": 475, "販": 476, "燐": 477, "蒲": 478, "俵": 479, "悪": 480, "位": 481, "信": 482, "云": 483, "婦": 484, "揶": 485, "肘": 486, "杯": 487, "紫": 488, "菓": 489, "戒": 490, "徒": 491, "頑": 492, "監": 493, "噂": 494, "代": 495, "淀": 496, "ば": 497, "沙": 498, "く": 499, "含": 500, "近": 501, "タ": 502, "二": 503, "嬉": 504, "縁": 505, "束": 506, "鮮": 507, "下": 508, "鼬": 509, "星": 510, "檜": 511, "届": 512, "争": 513, "ラ": 514, "る": 515, "幻": 516, "揉": 517, "咀": 518, "結": 519, "育": 520, "恢": 521, "涙": 522, "鎖": 523, "踞": 524, "畑": 525, "院": 526, "歳": 527, "手": 528, "員": 529, "ア": 530, "長": 531, "当": 532, "ぶ": 533, "箔": 534, "振": 535, "悲": 536, "捕": 537, "鶏": 538, "和": 539, "船": 540, "衆": 541, "響": 542, "値": 543, "連": 544, "冗": 545, "降": 546, "灰": 547, "恵": 548, "で": 549, "モ": 550, "館": 551, "套": 552, "生": 553, "偶": 554, "収": 555, "つ": 556, "係": 557, "械": 558, "秋": 559, "細": 560, "困": 561, "宙": 562, "笠": 563, "ィ": 564, "流": 565, "唇": 566, "堂": 567, "人": 568, "袴": 569, "襦": 570, "飛": 571, "脱": 572, "偽": 573, "唆": 574, "偉": 575, "善": 576, "巧": 577, "瘢": 578, "墓": 579, "最": 580, "え": 581, "郷": 582, "雑": 583, "領": 584, "扉": 585, "型": 586, "判": 587, "癒": 588, "愚": 589, "団": 590, "賃": 591, "刺": 592, "未": 593, "爛": 594, "ら": 595, "乏": 596, "厭": 597, "ょ": 598, "題": 599, "怪": 600, "章": 601, "演": 602, "空": 603, "眠": 604, "航": 605, "白": 606, "思": 607, "輸": 608, 
"沼": 609, "仏": 610, "買": 611, "嚀": 612, "メ": 613, "節": 614, "必": 615, "忠": 616, "擽": 617, "料": 618, "窟": 619, "ソ": 620, "蟠": 621, "吏": 622, "酌": 623, "年": 624, "治": 625, "条": 626, "芝": 627, "勉": 628, "俎": 629, "は": 630, "ボ": 631, "陰": 632, "拘": 633, "羨": 634, "正": 635, "盗": 636, "鎮": 637, "宝": 638, "招": 639, "笑": 640, "唐": 641, "緑": 642, "合": 643, "虚": 644, "据": 645, "遍": 646, "楽": 647, "狂": 648, "穴": 649, "眺": 650, "織": 651, "尨": 652, "営": 653, "適": 654, "傷": 655, "封": 656, "或": 657, "沈": 658, "背": 659, "申": 660, "伝": 661, "池": 662, "遮": 663, "特": 664, "詩": 665, "幸": 666, "誕": 667, "凡": 668, "ヒ": 669, "鰻": 670, "崩": 671, "術": 672, "避": 673, "捨": 674, "換": 675, "緊": 676, "灯": 677, "へ": 678, "柿": 679, "肩": 680, "沓": 681, "完": 682, "歩": 683, "固": 684, "映": 685, "寄": 686, "上": 687, "貰": 688, " ": 689, "一": 690, "別": 691, "め": 692, "街": 693, "託": 694, "苦": 695, "過": 696, "・": 697, "三": 698, "屋": 699, "添": 700, "ッ": 701, "ぎ": 702, "裁": 703, "果": 704, "学": 705, "毒": 706, "盛": 707, "熱": 708, "潜": 709, "吹": 710, "与": 711, "ザ": 712, "揺": 713, "運": 714, "抽": 715, "税": 716, "備": 717, "威": 718, "透": 719, "嗟": 720, "凧": 721, "接": 722, "夕": 723, "ず": 724, "康": 725, "氷": 726, "ひ": 727, "佇": 728, "羽": 729, "蹴": 730, "世": 731, "葬": 732, "紅": 733, "論": 734, "彎": 735, "―": 736, "央": 737, "均": 738, "苗": 739, "徴": 740, "迎": 741, "ュ": 742, "共": 743, "栄": 744, "登": 745, "咄": 746, "霧": 747, "能": 748, "停": 749, "端": 750, "旅": 751, "黄": 752, "元": 753, "巾": 754, "詞": 755, "働": 756, "至": 757, "や": 758, "炉": 759, "衝": 760, "皮": 761, "実": 762, "嘩": 763, "管": 764, "害": 765, "繁": 766, "徽": 767, "い": 768, "盲": 769, "観": 770, "柱": 771, "町": 772, "リ": 773, "貯": 774, "彩": 775, "弾": 776, "姓": 777, "攻": 778, "八": 779, "仰": 780, "鼻": 781, "闇": 782, "ョ": 783, "場": 784, "黙": 785, "及": 786, "精": 787, "飄": 788, "邪": 789, "鉄": 790, "奪": 791, "毎": 792, "杖": 793, "電": 794, "巡": 795, "纏": 796, "騙": 797, "占": 798, "括": 799, "瞼": 800, "贅": 801, "抱": 802, "裂": 803, "権": 804, "鯨": 805, "顔": 806, "玉": 807, "言": 808, 
"屈": 809, "せ": 810, "視": 811, "逃": 812, "閑": 813, "腫": 814, "球": 815, "川": 816, "辞": 817, "字": 818, "耳": 819, "尽": 820, "ど": 821, "罹": 822, "唸": 823, "慰": 824, "養": 825, "斥": 826, "弟": 827, "曖": 828, "等": 829, "序": 830, "去": 831, "次": 832, "陸": 833, "横": 834, "潤": 835, "傘": 836, "便": 837, "携": 838, "崎": 839, "資": 840, "刃": 841, "符": 842, "水": 843, "初": 844, "掻": 845, "半": 846, "候": 847, "ウ": 848, "朧": 849, "在": 850, "薬": 851, "儀": 852, "句": 853, "渦": 854, "沸": 855, "療": 856, "簡": 857, "以": 858, "異": 859, "帽": 860, "個": 861, "ト": 862, "垣": 863, "嘘": 864, "提": 865, "び": 866, "客": 867, "決": 868, "昇": 869, "渉": 870, "叫": 871, "会": 872, "枚": 873, "米": 874, "点": 875, "旧": 876, "賑": 877, "風": 878, "功": 879, "掃": 880, "劃": 881, "工": 882, "書": 883, "純": 884, "惑": 885, "階": 886, "蔑": 887, "撃": 888, "略": 889, "狙": 890, "前": 891, "域": 892, "賺": 893, "車": 894, "夏": 895, "理": 896, "常": 897, "否": 898, "往": 899, "藤": 900, "郵": 901, "揮": 902, "ろ": 903, "銃": 904, "軒": 905, "返": 906, "慮": 907, "奴": 908, "認": 909, "疲": 910, "摺": 911, "恣": 912, "ド": 913, "鉢": 914, "拳": 915, "是": 916, "平": 917, "火": 918, "充": 919, "西": 920, "念": 921, "葉": 922, "翻": 923, "油": 924, "駅": 925, "路": 926, "影": 927, "海": 928, "訪": 929, "抒": 930, "月": 931, "号": 932, "報": 933, "澄": 934, "件": 935, "画": 936, "妃": 937, "新": 938, "跳": 939, "的": 940, "麻": 941, "拙": 942, "酬": 943, "喜": 944, "心": 945, "単": 947, "語": 948, "借": 949, "較": 950, "刹": 951, "庁": 952, "宥": 953, "ポ": 954, "鳶": 955, "頭": 956, "皿": 957, "塀": 958, "境": 959, "馳": 960, "ぱ": 961, "づ": 962, "岬": 963, "銘": 964, "ル": 965, "医": 966, "夜": 967, "煉": 968, "価": 969, "ェ": 970, "チ": 971, "切": 972, "放": 973, "受": 974, "姉": 975, "王": 976, "碗": 977, "ふ": 978, "日": 979, "議": 980, "好": 981, "失": 982, "包": 983, "児": 984, "対": 985, "面": 986, "府": 987, "待": 988, "井": 989, "任": 990, "だ": 991, "紺": 992, "倦": 993, "萄": 994, "飴": 995, "笛": 996, "造": 997, "続": 998, "酪": 999, "ユ": 1000, "綴": 1001, "劇": 1002, "滑": 1003, "縞": 1004, "列": 1005, "途": 1006, "焼": 1007, "珠": 1008, 
"疑": 1009, "ピ": 1010, "障": 1011, "道": 1012, "請": 1013, "旨": 1014, "か": 1015, "器": 1016, "愉": 1017, "プ": 1018, "確": 1019, "艇": 1020, "懐": 1021, "銭": 1022, "座": 1023, "軌": 1024, "露": 1025, "標": 1026, "誠": 1027, "荷": 1028, "覆": 1029, "勘": 1030, "恥": 1031, "菌": 1032, "襖": 1033, "ヤ": 1034, "知": 1035, "冒": 1036, "げ": 1037, "散": 1038, "脳": 1039, "原": 1040, "懸": 1041, "夢": 1042, "晴": 1043, "波": 1044, "算": 1045, "押": 1046, "深": 1047, "シ": 1048, "石": 1049, "袢": 1050, "ヴ": 1051, "傍": 1052, "痴": 1053, "紛": 1054, "残": 1055, "羊": 1056, "匠": 1057, "従": 1058, "素": 1059, "跋": 1060, "肴": 1061, "埒": 1062, "擦": 1063, "達": 1064, "寸": 1065, "帳": 1066, "札": 1067, "渡": 1068, "青": 1069, "忘": 1070, "労": 1071, "筆": 1072, "他": 1073, "許": 1074, "鈴": 1075, "超": 1076, "若": 1077, "隠": 1078, "富": 1079, "キ": 1080, "欲": 1081, "込": 1082, "促": 1083, "肱": 1084, "濃": 1085, "死": 1086, "品": 1087, "矜": 1088, "粋": 1089, ".": 1090, "卑": 1091, "裡": 1092, "友": 1093, "進": 1094, "晩": 1095, "腹": 1096, "叔": 1097, "開": 1098, "延": 1099, "職": 1100, "金": 1101, "誘": 1102, "多": 1103, "消": 1104, "麗": 1105, "五": 1106, "率": 1107, "掘": 1108, "松": 1109, "本": 1110, "浴": 1111, "競": 1112, "矛": 1113, "ベ": 1114, "控": 1115, "易": 1116, "何": 1117, "給": 1118, "け": 1119, "仲": 1120, "倍": 1121, "宮": 1122, "説": 1123, "宵": 1124, "来": 1125, "寝": 1126, "相": 1127, "倫": 1128, "華": 1129, "痛": 1130, "指": 1131, "効": 1132, "囲": 1133, "取": 1134, "末": 1135, "耐": 1136, "負": 1137, "使": 1138, "畳": 1139, "几": 1140, "綺": 1141, "腋": 1142, "間": 1143, "証": 1144, "ニ": 1145, "仕": 1146, "膳": 1147, "燦": 1148, "咎": 1149, "紙": 1150, "砂": 1151, "務": 1152, "嫌": 1153, "乳": 1154, "移": 1155, "警": 1156, "碁": 1157, "雲": 1158, "短": 1159, "枝": 1160, "パ": 1161, "拠": 1162, "朝": 1163, "主": 1164, "父": 1165, "此": 1166, "変": 1167, "九": 1168, "斯": 1169, "藩": 1170, "令": 1171, "膨": 1172, "模": 1173, "身": 1174, "遠": 1175, "量": 1176, "葡": 1177, "蔽": 1178, "針": 1179, "滴": 1180, "依": 1181, "窮": 1182, "荒": 1183, "男": 1184, "ズ": 1185, "瞳": 1186, "印": 1187, "家": 1188, "円": 1189, "快": 
1190, "む": 1191, "融": 1192, "庭": 1193, "昧": 1194, "案": 1195, "麦": 1196, "髪": 1197, "供": 1198, "口": 1199, "フ": 1200, "阿": 1201, "殿": 1202, "ん": 1203, "熊": 1204, "診": 1205, "救": 1206, "源": 1207, "望": 1208, "台": 1209, "飲": 1210, "記": 1211, "増": 1212, "材": 1213, "紋": 1214, "ナ": 1215, "課": 1216, "盃": 1217, "挟": 1218, "体": 1219, "刈": 1220, "活": 1221, "侵": 1222, "グ": 1223, "損": 1224, "舗": 1225, "帰": 1226, "牽": 1227, "百": 1228, "握": 1229, "賀": 1230, "ゼ": 1231, "製": 1232, "竪": 1233, "玄": 1234, "咲": 1235, "隆": 1236, "浚": 1237, "格": 1238, "ァ": 1239, "鄭": 1240, "塞": 1241, "鳴": 1242, "諾": 1243, "奏": 1244, "作": 1245, "詳": 1246, "戯": 1247, "聞": 1248, "催": 1249, "惚": 1250, "春": 1251, "吸": 1252, "貴": 1253, "脚": 1254, "麺": 1255, "隔": 1256, "全": 1257, "挙": 1258, "直": 1259, "縮": 1260, "軍": 1261, "錨": 1262, "ビ": 1263, "曇": 1264, "縫": 1265, "況": 1266, "替": 1267, "破": 1268, "河": 1269, "凹": 1270, "椅": 1271, "廻": 1272, "搬": 1273, "永": 1274, "焦": 1275, "匂": 1276, "っ": 1277, "淡": 1278, "走": 1279, "栓": 1280, "越": 1281, "産": 1282, "戦": 1283, "致": 1284, "湾": 1285, "施": 1286, "触": 1287, "褞": 1288, "カ": 1289, "守": 1290, "誌": 1291, "始": 1292, "ヌ": 1293, "数": 1294, "欄": 1295, "竹": 1296, "戻": 1297, "漆": 1298, "同": 1299, "配": 1300, "田": 1301, "行": 1302, "拵": 1303, "角": 1304, "き": 1305, "挿": 1306, "吉": 1307, "穏": 1308, "構": 1309, "ゲ": 1310, "訊": 1311, "坐": 1312, "専": 1313, "植": 1314, "付": 1315, "強": 1316, "ミ": 1317, "支": 1318, "注": 1319, "力": 1320, "腰": 1321, "保": 1322, "住": 1323, "射": 1324, "仙": 1325, "導": 1326, "没": 1327, "覗": 1328, "順": 1329, "随": 1330, "厠": 1331, "撮": 1332, "声": 1333, "雪": 1334, "科": 1335, "談": 1336, "稽": 1337, "求": 1338, "袂": 1339, "似": 1340, "明": 1341, "津": 1342, "す": 1343, "法": 1344, "ジ": 1345, "汚": 1346, "禁": 1347, "病": 1348, "頃": 1349, "靴": 1350, "湯": 1351, "立": 1352, "種": 1353, "佐": 1354, "疾": 1355, "舶": 1356, "腕": 1357, "に": 1358, "泥": 1359, "民": 1360, "じ": 1361, "も": 1362, "歌": 1363, "室": 1364, "調": 1365, "減": 1366, "坂": 1367, "ン": 1368, "れ": 1369, "机": 1370, "柄": 1371, 
"誰": 1372, "琉": 1373, "絵": 1374, "敷": 1375, "吃": 1376, "橋": 1377, "甥": 1378, "乗": 1379, "伺": 1380, "鉛": 1381, "熨": 1382, "公": 1383, "猟": 1384, "義": 1385, "商": 1386, "険": 1387, "茶": 1388, "釣": 1389, "今": 1390, "氏": 1391, "丸": 1392, "聴": 1393, "縦": 1394, "先": 1395, "蘭": 1396, "送": 1397, "髯": 1398, "局": 1399, "段": 1400, "裕": 1401, "識": 1402, "定": 1403, "裏": 1404, "ね": 1405, "蔭": 1406, "奥": 1407, "脈": 1408, "了": 1409, "跟": 1410, "関": 1411, "出": 1412, "輝": 1413, "応": 1414, "戟": 1415, "事": 1416, "香": 1417, "虫": 1418, "駄": 1419, "ホ": 1420, "鞄": 1421, "笥": 1422, "阪": 1423, "地": 1424, "助": 1425, "国": 1426, "さ": 1427, "馬": 1428, "乞": 1429, "滅": 1430, "そ": 1431, "諦": 1432, "膝": 1433, "オ": 1434, "弱": 1435, "虜": 1436, "ぺ": 1437, "陥": 1438, "ペ": 1439, "普": 1440, "外": 1441, "デ": 1442, "柳": 1443, "逆": 1444, "逼": 1445, "堰": 1446, "ぜ": 1447, "危": 1448, "態": 1449, "親": 1450, "集": 1451, "泊": 1452, "ロ": 1453, "極": 1454, "宅": 1455, "浜": 1456, "割": 1457, "ぐ": 1458, "左": 1459, "凍": 1460, "再": 1461, "折": 1462, "奈": 1463, "準": 1464, "遺": 1465, "験": 1466, "護": 1467, "非": 1468, "捧": 1469, "召": 1470, "塊": 1471, "城": 1472, "跨": 1473, "象": 1474, "睡": 1475, "港": 1476, "持": 1477, "腸": 1478, "壁": 1479, "誂": 1480, "母": 1481, "恋": 1482, "規": 1483, "創": 1484, "真": 1485, "鹿": 1486, "終": 1487, "則": 1488, "林": 1489, "干": 1490, "塗": 1491, "追": 1492, "輪": 1493, "目": 1494, "綾": 1495, "編": 1496, "函": 1497, "冷": 1498, "サ": 1499, "譲": 1500, "岡": 1501, "複": 1502, "迫": 1503, "更": 1504, "気": 1505, "斗": 1506, "後": 1507, "庫": 1508, "瓦": 1509, "鋭": 1510, "煖": 1511, "有": 1512, "呆": 1513, "市": 1514, "士": 1515, "染": 1516, "重": 1517, "願": 1518, "剤": 1519, "花": 1520, "乾": 1521, "厚": 1522, "無": 1523, "唯": 1524, "蹲": 1525, "々": 1526, "楊": 1527, "ツ": 1528, "免": 1529, "質": 1530, "腐": 1531, "里": 1532, "丁": 1533, "万": 1534, "遭": 1535, "硝": 1536, "周": 1537, "煙": 1538, "沿": 1539, "想": 1540, "ヘ": 1541, "毫": 1542, "辺": 1543, "肉": 1544, "健": 1545, "雀": 1546, "昼": 1547, "昨": 1548, "処": 1549, "倉": 1550, "片": 1551, "飽": 1552, "履": 
1553, "洩": 1554, "厄": 1555, "愛": 1556, "州": 1557, "匹": 1558, "狭": 1559, "暦": 1560, "自": 1561, "益": 1562, "芸": 1563, "程": 1564, "福": 1565, "計": 1566, "隣": 1567, "抜": 1568, "伴": 1569, "式": 1570, "隅": 1571, "血": 1572, "試": 1573, "静": 1574, "搾": 1575, "首": 1576, "渋": 1577, "安": 1578, "速": 1579, "ノ": 1580, "喰": 1581, "東": 1582, "よ": 1583, "慢": 1584, "揃": 1585, "表": 1586, "併": 1587, "中": 1588, "糸": 1589, "将": 1590, "捻": 1591, "絡": 1592, "恐": 1593, "痢": 1594, "繰": 1595, "化": 1596, "際": 1597, "帯": 1598, "壊": 1599, "像": 1600, "老": 1601, "文": 1602, "閉": 1603, "揚": 1604, "都": 1605, "憚": 1606, "打": 1607, "婿": 1608, "勾": 1609, "患": 1610, "麭": 1611, "浮": 1612, "曝": 1613, "彼": 1614, "丈": 1615, "介": 1616, "得": 1617, "絶": 1618, "核": 1619, "光": 1620, "踏": 1621, "建": 1622, "竦": 1623, "羞": 1624, "詰": 1625, "岸": 1626, "券": 1627, "た": 1628, "鋼": 1629, "側": 1630, "胃": 1631, "症": 1632, "捩": 1633, "限": 1634, "両": 1635, "鏡": 1636, "考": 1637, "掌": 1638, "揄": 1639, "覧": 1640, "邦": 1641, "休": 1642, "溜": 1643, "隊": 1644, "嬲": 1645, "微": 1646, "需": 1647, "社": 1648, "彫": 1649, "企": 1650, "物": 1651, "拡": 1652, "費": 1653, "掬": 1654, "居": 1655, "怖": 1656, "栖": 1657, "湖": 1658, "革": 1659, "��": 1660, "京": 1661, "床": 1662, "夫": 1663, "臓": 1664, "崗": 1665, "坊": 1666, "溌": 1667, "右": 1668, "肥": 1669, "色": 1670, "ゆ": 1671, "修": 1672, "辱": 1673, "努": 1674, "草": 1675, "漢": 1676, "鋳": 1677, "農": 1678, "悩": 1679, "あ": 1680, "ォ": 1681, "捉": 1682, "銅": 1683, "向": 1684, "覚": 1685, "ハ": 1686, "究": 1687, "山": 1688, "窓": 1689, "門": 1690, "襞": 1691, "ム": 1692, "箱": 1693, "泣": 1694, "違": 1695, "方": 1696, "舞": 1697, "掛": 1698, "磨": 1699, "胡": 1700, "示": 1701, "除": 1702, "北": 1703, "呼": 1704, "発": 1705, "辛": 1706, "例": 1707, "意": 1708, "並": 1709, "厳": 1710, "探": 1711, "冬": 1712, "訴": 1713, "撫": 1714, "填": 1715, "薄": 1716, "房": 1717, "食": 1718, "菜": 1719, "用": 1720, "援": 1721, "嫁": 1722, "ヨ": 1723, "急": 1724, "興": 1725, "尺": 1726, "尊": 1727, "第": 1728, "週": 1729, "瓶": 1730, "躊": 1731, "組": 1732, "拶": 1733, "根": 1734, 
"糖": 1735, "綿": 1736, "伏": 1737, "問": 1738, "司": 1739, "臆": 1740, "授": 1741, "故": 1742, "蟇": 1743, "輩": 1744, "線": 1745, "塁": 1746, "ゴ": 1747, "遊": 1748, "勇": 1749, "宿": 1750, "低": 1751, "瑣": 1752, "珍": 1753, "読": 1754, "様": 1755, "贔": 1756, "見": 1757, "斟": 1758, "遼": 1759, "策": 1760, "担": 1761, "圧": 1762, "(": 1763, "甘": 1764, "具": 1765, "廊": 1766, "燃": 1767}