Masioki/fusion_gttbsc_distilbert-uncased-best

Masioki committed on Jun 4

Commit 574e521 • 1 Parent(s): a4eed7f

Update README.md

Files changed (1)

README.md +3 -3

@@ -33,10 +33,10 @@ Ground truth text with prosody encoding and ASR encoding residual cross attentio
 ## Model description
-ASR encoder: [Whisper small](https://huggingface.co/openai/whisper-small) encoder
+ASR encoder: [Whisper small](https://huggingface.co/openai/whisper-small) encoder
-Prosody encoder: 2 layer transformer encoder with initial dense projection
+Prosody encoder: 2 layer transformer encoder with initial dense projection
 Backbone: [DistilBert uncased](https://huggingface.co/distilbert/distilbert-base-uncased)
-Fusion: 2 residual cross attention fusion layers (F_asr x F_text and F_prosody x F_text) with dense layer on top
+Fusion: 2 residual cross attention fusion layers (F_asr x F_text and F_prosody x F_text) with dense layer on top
 Pooling: Self attention
 Multi-label classification head: 2 dense layers with two dropouts 0.3 and Tanh activation inbetween
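
For readers unfamiliar with the "residual cross attention fusion" named in the Fusion line, the following is a minimal, dependency-free sketch of the idea, not the model's actual code. It assumes that the text tokens act as queries and the ASR or prosody frames act as keys and values, uses a single head with no learned projections, and omits the dense layer on top; the function names and toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def residual_cross_attention(text, other):
    """Fuse `other` (e.g. ASR or prosody frames) into `text` tokens.

    Assumption: text tokens are the queries, `other` supplies keys and
    values, and both share the same feature dimension. No learned
    projections are applied in this sketch.
    """
    dim = len(text[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in text:
        # Scaled dot-product score of this text token against every frame.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in other]
        weights = softmax(scores)
        # Attention output: weighted sum of the frame vectors.
        attended = [sum(w * frame[i] for w, frame in zip(weights, other))
                    for i in range(dim)]
        # Residual connection: add the attended vector back onto the token.
        fused.append([q + a for q, a in zip(query, attended)])
    return fused


# Toy inputs (hypothetical values): 2 text tokens, 3 ASR frames, 1 prosody frame.
text = [[1.0, 0.0], [0.0, 1.0]]
asr = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
prosody = [[0.1, 0.2]]

f_asr = residual_cross_attention(text, asr)
f_prosody = residual_cross_attention(text, prosody)

# Concatenate both fused views per token; in the described model a dense
# layer would follow here, which this sketch leaves out.
combined = [a + p for a, p in zip(f_asr, f_prosody)]
```

Each entry of `combined` holds one text token's ASR-fused and prosody-fused features side by side, which is the shape a subsequent dense layer, pooling step, and classification head would consume.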