Upload folder using huggingface_hub

Files changed (7)

.ipynb_checkpoints/README-checkpoint.md +162 -0
README.md +162 -3
config.json +224 -0
model.bin +3 -0
preprocessor_config.json +14 -0
tokenizer.json +0 -0
vocabulary.json +0 -0

.ipynb_checkpoints/README-checkpoint.md ADDED

	@@ -0,0 +1,162 @@

+---
+language:
+- en
+- zh
+- de
+- es
+- ru
+- ko
+- fr
+- ja
+- pt
+- tr
+- pl
+- ca
+- nl
+- ar
+- sv
+- it
+- id
+- hi
+- fi
+- vi
+- he
+- uk
+- el
+- ms
+- cs
+- ro
+- da
+- hu
+- ta
+- no
+- th
+- ur
+- hr
+- bg
+- lt
+- la
+- mi
+- ml
+- cy
+- sk
+- te
+- fa
+- lv
+- bn
+- sr
+- az
+- sl
+- kn
+- et
+- mk
+- br
+- eu
+- is
+- hy
+- ne
+- mn
+- bs
+- kk
+- sq
+- sw
+- gl
+- mr
+- pa
+- si
+- km
+- sn
+- yo
+- so
+- af
+- oc
+- ka
+- be
+- tg
+- sd
+- gu
+- am
+- yi
+- lo
+- uz
+- fo
+- ht
+- ps
+- tk
+- nn
+- mt
+- sa
+- lb
+- my
+- bo
+- tl
+- mg
+- as
+- tt
+- haw
+- ln
+- ha
+- ba
+- jw
+- su
+tags:
+  - audio
+  - automatic-speech-recognition
+license: mit
+library_name: ctranslate2
+---
+# Whisper large-v3-turbo model for CTranslate2
+This repository contains the conversion of [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo) to the [CTranslate2](https://github.com/OpenNMT/CTranslate2) model format.
+This model can be used in CTranslate2 or projects based on CTranslate2 such as [faster-whisper](https://github.com/systran/faster-whisper).
+## Example with batch inference
+```python
+import time
+from faster_whisper import WhisperModel, BatchedInferencePipeline
+from faster_whisper.audio import decode_audio
+model = WhisperModel("Infomaniak-AI/faster-whisper-large-v3-turbo",
+                     device="cuda",
+                     num_workers=4,
+                     compute_type='float16')
+batch = BatchedInferencePipeline(model=model,
+                                 use_vad_model=True,
+                                 chunk_length=30)
+audio = decode_audio("audio.mp3", sampling_rate=model.feature_extractor.sampling_rate)
+start_time = time.time()
+segment_generator, info = batch.transcribe(audio,
+                                           batch_size=32,
+                                           beam_size=5,
+                                           task="transcribe",
+                                           word_timestamps=True,
+                                           suppress_blank=True)
+segments = []
+text = ""
+for segment in segment_generator:
+    segments.append(segment)
+    text = text + segment.text
+print("--- %s seconds ---" % (time.time() - start_time))
+```
+## Conversion details
+The original model was converted with the following command:
+```
+ct2-transformers-converter --model openai/whisper-large-v3-turbo --output_dir whisper-large-v3-turbo --copy_files tokenizer.json preprocessor_config.json --quantization float16
+```
+Note that the model weights are saved in FP16. This type can be changed when the model is loaded using the [`compute_type` option in CTranslate2](https://opennmt.net/CTranslate2/quantization.html).
+## More information
+**For more information about the original model, see its [model card](https://huggingface.co/openai/whisper-large-v3-turbo).**

README.md CHANGED

@@ -1,3 +1,162 @@
----
-license: apache-2.0
----

+---
+language:
+- en
+- zh
+- de
+- es
+- ru
+- ko
+- fr
+- ja
+- pt
+- tr
+- pl
+- ca
+- nl
+- ar
+- sv
+- it
+- id
+- hi
+- fi
+- vi
+- he
+- uk
+- el
+- ms
+- cs
+- ro
+- da
+- hu
+- ta
+- no
+- th
+- ur
+- hr
+- bg
+- lt
+- la
+- mi
+- ml
+- cy
+- sk
+- te
+- fa
+- lv
+- bn
+- sr
+- az
+- sl
+- kn
+- et
+- mk
+- br
+- eu
+- is
+- hy
+- ne
+- mn
+- bs
+- kk
+- sq
+- sw
+- gl
+- mr
+- pa
+- si
+- km
+- sn
+- yo
+- so
+- af
+- oc
+- ka
+- be
+- tg
+- sd
+- gu
+- am
+- yi
+- lo
+- uz
+- fo
+- ht
+- ps
+- tk
+- nn
+- mt
+- sa
+- lb
+- my
+- bo
+- tl
+- mg
+- as
+- tt
+- haw
+- ln
+- ha
+- ba
+- jw
+- su
+tags:
+  - audio
+  - automatic-speech-recognition
+license: mit
+library_name: ctranslate2
+---
+# Whisper large-v3-turbo model for CTranslate2
+This repository contains the conversion of [openai/whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo) to the [CTranslate2](https://github.com/OpenNMT/CTranslate2) model format.
+This model can be used in CTranslate2 or projects based on CTranslate2 such as [faster-whisper](https://github.com/systran/faster-whisper).
+## Example with batch inference
+```python
+import time
+from faster_whisper import WhisperModel, BatchedInferencePipeline
+from faster_whisper.audio import decode_audio
+model = WhisperModel("Infomaniak-AI/faster-whisper-large-v3-turbo",
+                     device="cuda",
+                     num_workers=4,
+                     compute_type='float16')
+batch = BatchedInferencePipeline(model=model,
+                                 use_vad_model=True,
+                                 chunk_length=30)
+audio = decode_audio("audio.mp3", sampling_rate=model.feature_extractor.sampling_rate)
+start_time = time.time()
+segment_generator, info = batch.transcribe(audio,
+                                           batch_size=32,
+                                           beam_size=5,
+                                           task="transcribe",
+                                           word_timestamps=True,
+                                           suppress_blank=True)
+segments = []
+text = ""
+for segment in segment_generator:
+    segments.append(segment)
+    text = text + segment.text
+print("--- %s seconds ---" % (time.time() - start_time))
+```
+## Conversion details
+The original model was converted with the following command:
+```
+ct2-transformers-converter --model openai/whisper-large-v3-turbo --output_dir whisper-large-v3-turbo --copy_files tokenizer.json preprocessor_config.json --quantization float16
+```
+Note that the model weights are saved in FP16. This type can be changed when the model is loaded using the [`compute_type` option in CTranslate2](https://opennmt.net/CTranslate2/quantization.html).
+## More information
+**For more information about the original model, see its [model card](https://huggingface.co/openai/whisper-large-v3-turbo).**
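
Reviewer note: the `compute_type` remark in the README can be illustrated with a small sketch. The helper `pick_compute_type` and its device-to-type mapping are assumptions made for illustration and are not part of this commit; `float16` and `int8` are compute types documented by CTranslate2, and the `WhisperModel(..., device=..., compute_type=...)` call mirrors the one in the README example.

```python
def pick_compute_type(device: str) -> str:
    """Hypothetical helper: choose a CTranslate2 compute type for a device."""
    # Assumption: keep the stored FP16 weights on GPU, quantize to INT8 on CPU.
    return "float16" if device.startswith("cuda") else "int8"


def load_model(device: str = "cuda"):
    """Load the converted model with a compute type suited to the device."""
    # Imported lazily so pick_compute_type works without faster-whisper installed.
    from faster_whisper import WhisperModel

    return WhisperModel(
        "Infomaniak-AI/faster-whisper-large-v3-turbo",
        device=device,
        compute_type=pick_compute_type(device),
    )
```

Calling `load_model("cpu")` would then request INT8 instead of FP16. Whether that is the best trade-off depends on the hardware, so treat the mapping as a starting point rather than a recommendation.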

config.json ADDED

	@@ -0,0 +1,224 @@

+{
+  "alignment_heads": [
+    [
+      2,
+      4
+    ],
+    [
+      2,
+      11
+    ],
+    [
+      3,
+      3
+    ],
+    [
+      3,
+      6
+    ],
+    [
+      3,
+      11
+    ],
+    [
+      3,
+      14
+    ]
+  ],
+  "lang_ids": [
+    50259,
+    50260,
+    50261,
+    50262,
+    50263,
+    50264,
+    50265,
+    50266,
+    50267,
+    50268,
+    50269,
+    50270,
+    50271,
+    50272,
+    50273,
+    50274,
+    50275,
+    50276,
+    50277,
+    50278,
+    50279,
+    50280,
+    50281,
+    50282,
+    50283,
+    50284,
+    50285,
+    50286,
+    50287,
+    50288,
+    50289,
+    50290,
+    50291,
+    50292,
+    50293,
+    50294,
+    50295,
+    50296,
+    50297,
+    50298,
+    50299,
+    50300,
+    50301,
+    50302,
+    50303,
+    50304,
+    50305,
+    50306,
+    50307,
+    50308,
+    50309,
+    50310,
+    50311,
+    50312,
+    50313,
+    50314,
+    50315,
+    50316,
+    50317,
+    50318,
+    50319,
+    50320,
+    50321,
+    50322,
+    50323,
+    50324,
+    50325,
+    50326,
+    50327,
+    50328,
+    50329,
+    50330,
+    50331,
+    50332,
+    50333,
+    50334,
+    50335,
+    50336,
+    50337,
+    50338,
+    50339,
+    50340,
+    50341,
+    50342,
+    50343,
+    50344,
+    50345,
+    50346,
+    50347,
+    50348,
+    50349,
+    50350,
+    50351,
+    50352,
+    50353,
+    50354,
+    50355,
+    50356,
+    50357,
+    50358
+  ],
+  "suppress_ids": [
+    1,
+    2,
+    7,
+    8,
+    9,
+    10,
+    14,
+    25,
+    26,
+    27,
+    28,
+    29,
+    31,
+    58,
+    59,
+    60,
+    61,
+    62,
+    63,
+    90,
+    91,
+    92,
+    93,
+    359,
+    503,
+    522,
+    542,
+    873,
+    893,
+    902,
+    918,
+    922,
+    931,
+    1350,
+    1853,
+    1982,
+    2460,
+    2627,
+    3246,
+    3253,
+    3268,
+    3536,
+    3846,
+    3961,
+    4183,
+    4667,
+    6585,
+    6647,
+    7273,
+    9061,
+    9383,
+    10428,
+    10929,
+    11938,
+    12033,
+    12331,
+    12562,
+    13793,
+    14157,
+    14635,
+    15265,
+    15618,
+    16553,
+    16604,
+    18362,
+    18956,
+    20075,
+    21675,
+    22520,
+    26130,
+    26161,
+    26435,
+    28279,
+    29464,
+    31650,
+    32302,
+    32470,
+    36865,
+    42863,
+    47425,
+    49870,
+    50254,
+    50258,
+    50359,
+    50360,
+    50361,
+    50362,
+    50363
+  ],
+  "suppress_ids_begin": [
+    220,
+    50257
+  ]
+}
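
Reviewer note: a quick sanity check of the structure of `config.json`, as a minimal sketch. It uses an abbreviated copy of the file: `lang_ids` is rebuilt from its first and last values (50259 to 50358) and `suppress_ids` is left out. The variable names are illustrative, and reading `alignment_heads` as `[layer, head]` pairs follows the CTranslate2 Whisper convention.

```python
import json

# Abbreviated copy of config.json: lang_ids is rebuilt from its first and last
# values (50259..50358) instead of being listed in full; suppress_ids is omitted.
config_text = json.dumps({
    "alignment_heads": [[2, 4], [2, 11], [3, 3], [3, 6], [3, 11], [3, 14]],
    "lang_ids": list(range(50259, 50359)),
    "suppress_ids_begin": [220, 50257],
})

config = json.loads(config_text)
lang_ids = config["lang_ids"]

# The language token IDs should form one contiguous block.
is_contiguous = lang_ids == list(range(lang_ids[0], lang_ids[-1] + 1))
num_languages = len(lang_ids)

# Each alignment head is a [layer, head] pair; collect the decoder layers used.
layers_used = sorted({layer for layer, _head in config["alignment_heads"]})

print(num_languages, is_contiguous, layers_used)
```

The check yields 100 contiguous language IDs, with alignment heads taken from decoder layers 2 and 3. The README front matter lists 99 language codes, so the two lists do not line up one-to-one.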

model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e76620f83d5f5b69efd3d87e3dc180c1bd21df9fbebacfd4335e5e1efcc018da
+size 1617884929
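
Reviewer note: `model.bin` is committed as a Git LFS pointer, not as the weights themselves. The sketch below parses the three pointer fields shown above and derives a rough size estimate. `parse_lfs_pointer` is a hypothetical helper, and the parameter estimate assumes every stored value takes 2 bytes (FP16) and ignores any file overhead.

```python
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e76620f83d5f5b69efd3d87e3dc180c1bd21df9fbebacfd4335e5e1efcc018da
size 1617884929
"""


def parse_lfs_pointer(text: str) -> dict:
    """Hypothetical helper: split a Git LFS pointer into its fields."""
    # Each non-empty line has the form "<key> <value>".
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(pointer_text)
size_gb = pointer["size"] / 1e9
# Assumption: 2 bytes per weight (FP16), no header or metadata overhead.
approx_fp16_params = pointer["size"] // 2

print(f"{size_gb:.2f} GB, at most about {approx_fp16_params / 1e6:.0f}M FP16 values")
```

The size works out to about 1.62 GB, which is consistent with weights saved in FP16 as the README states.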

preprocessor_config.json ADDED

	@@ -0,0 +1,14 @@

+{
+  "chunk_length": 30,
+  "feature_extractor_type": "WhisperFeatureExtractor",
+  "feature_size": 128,
+  "hop_length": 160,
+  "n_fft": 400,
+  "n_samples": 480000,
+  "nb_max_frames": 3000,
+  "padding_side": "right",
+  "padding_value": 0.0,
+  "processor_class": "WhisperProcessor",
+  "return_attention_mask": false,
+  "sampling_rate": 16000
+}
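
Reviewer note: the values in `preprocessor_config.json` are tied together by simple arithmetic, shown here as a short worked check. The variable names mirror the JSON keys; `frame_ms` and `window_ms` are derived quantities added for illustration.

```python
# Values copied from preprocessor_config.json.
chunk_length = 30       # seconds of audio per chunk
sampling_rate = 16000   # samples per second
hop_length = 160        # samples between consecutive frames
n_fft = 400             # samples per analysis window

# Samples in one chunk: 30 s * 16000 Hz.
n_samples = chunk_length * sampling_rate

# Frames in one chunk: one frame every hop_length samples.
nb_max_frames = n_samples // hop_length

# Derived timings in milliseconds.
frame_ms = 1000 * hop_length / sampling_rate
window_ms = 1000 * n_fft / sampling_rate

print(n_samples, nb_max_frames, frame_ms, window_ms)
```

This reproduces `n_samples` = 480000 and `nb_max_frames` = 3000 from the file, with one frame every 10 ms computed over a 25 ms window.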

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

vocabulary.json ADDED

The diff for this file is too large to render. See raw diff