Hugging Face usage and local usage give different results
import numpy as np
import torch
from transformers import AutoProcessor, WhisperForConditionalGeneration
from scipy.io import wavfile

processor = AutoProcessor.from_pretrained("emre/whisper-medium-turkish-2")
model = WhisperForConditionalGeneration.from_pretrained("emre/whisper-medium-turkish-2")

# Read the WAV file, reinterpret the samples as int16, and scale to float32 in [-1, 1]
samplerate, data = wavfile.read('./audio.wav')
data_s16 = np.frombuffer(data, dtype=np.int16, count=len(data) // 2, offset=0)
x_data = data_s16.astype(np.float32, order='C') / 32768.0

# Extract input features and generate the transcription
inputs = processor(x_data, return_tensors="pt")
input_features = inputs.input_features
generated_ids = model.generate(inputs=input_features)
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(transcription)
###############################################
(I think I'm using it correctly.)
Even though I upload the same audio file, the Hugging Face hosted inference gives an accurate transcription, while my local machine produces nonsense output.
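For reference, this is a minimal sketch of what I suspect the hosted widget might be doing differently with the audio. It assumes the model expects mono 16 kHz input and that the processor should be told the sampling rate explicitly; the use of librosa here is my own guess, not something I have confirmed against the hosted pipeline.

# Minimal sketch (assumption): load the audio as mono 16 kHz floats and pass the
# sampling rate to the processor, instead of hand-converting the raw WAV samples.
import librosa
from transformers import AutoProcessor, WhisperForConditionalGeneration

processor = AutoProcessor.from_pretrained("emre/whisper-medium-turkish-2")
model = WhisperForConditionalGeneration.from_pretrained("emre/whisper-medium-turkish-2")

# librosa resamples to 16 kHz and returns float32 samples in [-1, 1]
speech, sr = librosa.load("./audio.wav", sr=16000, mono=True)

inputs = processor(speech, sampling_rate=sr, return_tensors="pt")
generated_ids = model.generate(inputs=inputs.input_features)
print(processor.batch_decode(generated_ids, skip_special_tokens=True))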
Could you share the error or the exact output you are getting?