metadata
language: ko
tags:
- whisper
- speech-recognition
datasets:
- maxseats/aihub-464-preprocessed-680GB-set-0
metrics:
- cer
Model Name : SungBeom/whisper-small-ko
Description
- Fine-tuning dataset: maxseats/aihub-464-preprocessed-680GB-set-0
- This model was fine-tuned on the first subset (10 GB) of the 680 GB AI Hub meeting speech dataset, which is organized by major domain.
- Dataset link: https://huggingface.co/datasets/maxseats/aihub-464-preprocessed-680GB-set-0
Parameters
model_name = "SungBeom/whisper-small-ko" # Alternative: "SungBeom/whisper-small-ko"
dataset_name = "maxseats/aihub-464-preprocessed-680GB-set-0" # Dataset to load (Hugging Face Hub name)
CACHE_DIR = '/mnt/a/maxseats/.finetuning_cache' # Cache directory
is_test = False # True: test with a small sample of the data, False: run the actual fine-tuning
token = "hf_" # Enter your Hugging Face token
training_args = Seq2SeqTrainingArguments(
output_dir=model_dir, # Name of the repository you want to use
per_device_train_batch_size=16,
gradient_accumulation_steps=2, # Double this each time the batch size is halved
learning_rate=1e-5,
warmup_steps=1000,
# max_steps=2, # Set this instead of epochs
num_train_epochs=1, # Number of epochs; set only one of this and max_steps
gradient_checkpointing=True,
fp16=True,
evaluation_strategy="steps",
per_device_eval_batch_size=16,
predict_with_generate=True,
generation_max_length=225,
save_steps=1000,
eval_steps=1000,
logging_steps=25,
report_to=["tensorboard"],
load_best_model_at_end=True,
metric_for_best_model="cer", # For Korean, 'cer' is likely a better fit than 'wer'
greater_is_better=False,
push_to_hub=True,
save_total_limit=5, # Maximum number of checkpoints to keep
)
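
The arguments above select the best checkpoint by "cer" (character error rate), with lower values being better. As a rough illustration of what that metric measures, here is a minimal pure-Python sketch. It is not the evaluation code used for this model, which is not shown in this card; the function names `edit_distance` and `cer` are hypothetical, and the sketch assumes that every character, including spaces, counts toward the reference length.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed per character."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], predictions: Sequence[str]) -> float:
    """Corpus-level CER: total character edits / total reference characters."""
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, p) for r, p in zip(references, predictions))
    return total_edits / total_chars


if __name__ == "__main__":
    # One substituted character out of four reference characters.
    print(cer(["abcd"], ["abxd"]))
```

Because the score is an error rate, `greater_is_better=False` is the matching setting: a checkpoint with a lower CER is treated as the better one.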