Cnam-LMSSC/EBEN_throat_microphone


jhauret committed on Jun 18

Commit 60b8d6a • 1 Parent(s): 8d90504

Upload EBENGenerator

Files changed (3)

README.md +50 -0
config.json +5 -0
model.safetensors +3 -0

README.md ADDED

	@@ -0,0 +1,50 @@

+---
+language: fr
+license: mit
+library_name: transformers
+tags:
+- audio
+- audio-to-audio
+- speech
+datasets:
+- Cnam-LMSSC/vibravox
+---
+# Model Card
+- **Developed by:** [Cnam-LMSSC](https://huggingface.co/Cnam-LMSSC)
+- **Model type:** [EBEN](https://github.com/jhauret/vibravox/blob/main/vibravox/torch_modules/dnn/eben_generator.py) (see [publication](https://ieeexplore.ieee.org/document/10244161))
+- **Language:** French
+- **License:** MIT
+- **Finetuned dataset:** `speech_clean` subset of [Cnam-LMSSC/vibravox](https://huggingface.co/datasets/Cnam-LMSSC/vibravox)
+- **Sample rate for usage:** 16 kHz
+## Overview
+This bandwidth extension model is trained on data from one specific body-conduction sensor, the throat microphone, from the [Vibravox dataset](https://huggingface.co/datasets/Cnam-LMSSC/vibravox).
+The model is designed to enhance the quality of body-conducted speech by denoising it and regenerating mid and high frequencies from low-frequency content only.
+## Disclaimer
+This model has been trained for **specific non-conventional speech sensors** and is intended to be used with **in-domain data**.
+Please be advised that using this model on data from other sensors may result in suboptimal performance.
+## Training procedure
+Detailed instructions for reproducing the experiments are available in the [jhauret/vibravox](https://github.com/jhauret/vibravox) GitHub repository.
+## Inference script
+```python
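+import torchaudio  # torchaudio.load reads audio files; torch.load only reads serialized tensors
+from vibravox.torch_modules.dnn.eben_generator import EBENGenerator  # module path from the link above
+model = EBENGenerator.from_pretrained("Cnam-LMSSC/EBEN_throat_microphone")  # assumes Hub loading is supported
+audio_16kHz, _ = torchaudio.load("path_to_audio")  # returns (waveform, sample_rate); expects a 16 kHz file
+cut_audio_16kHz = model.cut_to_valid_length(audio_16kHz)
+enhanced_audio_16kHz = model(cut_audio_16kHz)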
+```
+## Links to BWE models trained on other body-conduction sensors
+An entry point to all **audio bandwidth extension** (BWE) models trained on different sensor data from the [Vibravox dataset](https://huggingface.co/datasets/Cnam-LMSSC/vibravox) is available at [https://huggingface.co/Cnam-LMSSC/vibravox_EBEN_bwe_models](https://huggingface.co/Cnam-LMSSC/vibravox_EBEN_bwe_models).

config.json ADDED

	@@ -0,0 +1,5 @@

+{
+  "m": 4,
+  "n": 32,
+  "p": 2
+}
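
The three keys in config.json are not documented in this commit. The sketch below shows what they would imply at the 16 kHz sample rate stated in the README, under assumptions that are not confirmed by the diff: that `m` is the number of uniform sub-bands the signal is split into, that `p` is the number of low sub-bands the model reads as input, and that `n` is a filter kernel size that does not affect the band layout. The helper `describe_bands` is hypothetical and not part of the vibravox package.

```python
import json

# Contents of config.json as added in this commit.
CONFIG_JSON = '{"m": 4, "n": 32, "p": 2}'
SAMPLE_RATE_HZ = 16_000  # "Sample rate for usage" from the README


def describe_bands(config, sample_rate_hz):
    """Return the band layout implied by the config.

    Assumption (not confirmed by the commit): "m" is the number of uniform
    sub-bands and "p" is the number of low sub-bands used as model input.
    """
    m, p = config["m"], config["p"]
    if not 0 < p <= m:
        raise ValueError("p must satisfy 0 < p <= m")
    nyquist_hz = sample_rate_hz / 2
    band_width_hz = nyquist_hz / m
    return {
        "nyquist_hz": nyquist_hz,
        "band_width_hz": band_width_hz,
        "input_cutoff_hz": p * band_width_hz,
    }


config = json.loads(CONFIG_JSON)
summary = describe_bands(config, SAMPLE_RATE_HZ)
print(summary)
```

Under these assumptions each sub-band spans 2 kHz, and the model would read content up to 4 kHz and regenerate the range from 4 kHz to 8 kHz.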

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1bb60a40d2da1da028c190605251fce2ec1c45a6f4264020d639725c16cefbac
+size 7797832
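
The three lines added for model.safetensors are a Git LFS pointer, not the weights themselves: the pointer records the SHA-256 digest and byte size of the real file. A minimal sketch of how a downloaded copy could be checked against this pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is left for the reader to supply.

```python
import hashlib
from pathlib import Path

# Pointer text exactly as it appears in the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:1bb60a40d2da1da028c190605251fce2ec1c45a6f4264020d639725c16cefbac
size 7797832
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the pointer's size and digest."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so large weight files are not read at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local download would then confirm that the file is the roughly 7.8 MB object this commit refers to.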