---
license: mit
tags:
- code
- audio
- acceleration
- network
---

## Overview

This is an implementation of Whisper written from scratch in C++. The related binaries are available on HuggingFace.

This is a proof of concept. Further modifications and improvements are coming. Feedback is welcome in the corresponding GitHub repository, [precompAId](https://github.com/anycores/precompAId).

The binary package contains:
* exe for testing the app quickly
* header and dll for building custom solutions
* main.cpp as an example of how to use the header (the exe is compiled from this file)
* weights.xgdf (must be loaded into the graph; no other input is required)
* audios folder, containing examples to try the application
* convert.py for creating the right input for the application from an arbitrary audio file

## Quick start

Example usage of whisper.exe:

```
whisper.exe weights.xgdf audios\voice_example1.pb
```

Example compilation (with clang, from the root):

```
clang++ main.cpp win64\whisper.lib -o whisper.exe
```

Example conversion of an audio file:

```
python convert.py --ipath audios\voice_example_orig1.wav --opath voice_example.pb
```

## Implementation info

Tested on:
* Windows 10
* Intel i7 11th gen
* clang 16.0.6 as compiler

Current properties:
* fp32
* AVX-512 is required

## Further Notes

Improved versions will arrive regularly. Feedback is welcome, especially on the following:
* features to add (input format, expected output format etc.)
* devices (there are plans to extend to mobile devices, IPUs etc.)
* models (what other models would be great to accelerate)
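
Since AVX-512 is listed as a requirement under Implementation info, it can help to check for it before running the binary. The following is a minimal sketch, not part of the shipped package: the helper names `ebx_has_avx512f` and `cpu_reports_avx512f` are hypothetical, and it assumes the AVX-512 Foundation subset (CPUID leaf 7, subleaf 0, EBX bit 16) is the relevant one, which this README does not specify. It only reads the CPU feature flag and does not confirm that the operating system has enabled the AVX-512 register state.

```cpp
#include <cstdint>

#if defined(_MSC_VER)
#include <intrin.h>   // __cpuid, __cpuidex (MSVC and clang targeting MSVC)
#elif defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>    // __get_cpuid_count (GCC and clang)
#endif

// AVX-512 Foundation is reported in CPUID leaf 7, subleaf 0, EBX bit 16.
// Hypothetical helper: takes the raw EBX value so it can be checked in isolation.
constexpr bool ebx_has_avx512f(std::uint32_t leaf7_ebx) {
    return ((leaf7_ebx >> 16) & 1u) != 0;
}

// Hypothetical helper: queries the CPU and returns true if it reports AVX-512F.
// Returns false on non-x86 targets or when leaf 7 is not available.
inline bool cpu_reports_avx512f() {
#if defined(_MSC_VER)
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 0);                 // regs[0] = highest supported basic leaf
    if (regs[0] < 7) return false;
    __cpuidex(regs, 7, 0);            // regs[1] = EBX of leaf 7, subleaf 0
    return ebx_has_avx512f(static_cast<std::uint32_t>(regs[1]));
#elif defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) return false;
    return ebx_has_avx512f(ebx);
#else
    return false;
#endif
}
```

One possible use is to call `cpu_reports_avx512f()` at the start of a custom `main` and exit with a clear message when it returns false, instead of letting the process fail on an unsupported instruction.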