slseanwu committed
Commit ca4f21e
1 Parent(s): 0f1ea92

add readme; gh repo to be added

Files changed (1)
  1. README.md +28 -0
README.md CHANGED
@@ -1,3 +1,31 @@
  ---
  license: apache-2.0
+ language:
+ - en
+ library_name: transformers
+ tags:
+ - audio-captioning
+ - audiocaps
+ - clotho
+ - dcase-challenge
+ - icassp-24
  ---
+ ## Summary
+ This repo contains the config & pretrained weights of the model described in the following paper:
+ - **Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation**
+ Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, François Germain, Jonathan Le Roux, and Shinji Watanabe
+ Int. Conf. on Acoustics, Speech, and Signal Processing (**ICASSP**) 2024
+ [[arXiv page](https://arxiv.org/abs/2309.17352)]
+ ## GitHub Repository
+ To use this model, please refer to our code published at:
+ - TBA
+ ## BibTex
+ If you find our model useful, please consider citing our paper. Thanks!
+ ```
+ @inproceedings{wu2024improving,
+ title={Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation},
+ author={Wu, Shih-Lun and Chang, Xuankai and Wichern, Gordon and Jung, Jee-weon and Germain, Fran{\c{c}}ois and Le Roux, Jonathan and Watanabe, Shinji},
+ booktitle={Proc. Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP)},
+ year={2024}
+ }
+ ```