kamilakesbi committed
Commit 83407f3
1 Parent(s): 117475e

Update README.md

Files changed (1): README.md +8 -10
README.md CHANGED
@@ -7,28 +7,26 @@ sdk: static
 pinned: false
 ---
 
-[diarizers-community](https://huggingface.co/diarizers-community) aims to promote speaker diarization on the Hugging Face hub. It comes with [diarizers](https://github.com/kamilakesbi/diarizers), a library for fine-tuning pyannote speaker diarzaition models that is compatible with the Hugging Face ecosystem.
-
-This organization contains:
-
-- A collection of [multilingual speaker diarization datasets](https://huggingface.co/collections/diarizers-community/speaker-diarization-datasets-66261b8d571552066e003788) that are compatible with diarizers. They have been processed using [diarizers scripts](https://github.com/kamilakesbi/diarizers/blob/main/datasets/README.md).
-
-The currently available datasets are the CallHome (Japanese, Chinese, German, Spanish, English), the AMI Corpus (English), Vox-Converse (English) and Simsamu (French). We aim at adding more datasets in the future to support speaker diarization on the Hub.
+[diarizers-community](https://huggingface.co/diarizers-community) aims to promote speaker diarization on the Hugging Face Hub. It contains:
+
+- A collection of [multilingual speaker diarization datasets](https://huggingface.co/collections/diarizers-community/speaker-diarization-datasets-66261b8d571552066e003788) that are compatible with the [diarizers](https://github.com/kamilakesbi/diarizers) library. They have been processed using [diarizers scripts](https://github.com/kamilakesbi/diarizers/blob/main/datasets/README.md).
+
+The currently available datasets are CallHome (Japanese, Chinese, German, Spanish, English), the AMI Corpus (English), VoxConverse (English) and Simsamu (French). We aim to add more datasets in the future to better support speaker diarization on the Hub.
 
 - A collection of [5 fine-tuned segmentation model](https://huggingface.co/collections/diarizers-community/models-66261d0f9277b825c807ff2a) baselines that can be used in a pyannote speaker diarization pipeline.
 
-- Each model has been fine-tuned on a specific language of the Callhome dataset. Compared to the pre-trained pyannote [segmentation model](https://huggingface.co/pyannote/segmentation-3.0), they obtain better performances on each language:
+Each model has been fine-tuned on a specific language of the CallHome dataset. Compared to the pre-trained pyannote [segmentation model](https://huggingface.co/pyannote/segmentation-3.0), they achieve better performance on multilingual data:
 
 
 ** ADD BENCHMARK **
 
-Note: Results have been obtained using the `test_segmentation.py` script from diarizers.
+Note: Results have been obtained using the [test script](https://github.com/kamilakesbi/diarizers/blob/main/test_segmentation.py) from diarizers.
 
-Together with this organisation, we release:
+diarizers-community comes with:
 
-- The diarizers library, to fine-tune pyannote segmentation models and use them back in a pyannote speaker diarization pipeline.
+- [diarizers](https://github.com/kamilakesbi/diarizers/tree/main), a library for fine-tuning pyannote speaker diarization models using the Hugging Face ecosystem. It can be used to improve performance on both English and multilingual diarization datasets with simple example scripts, with as little as ten hours of labelled diarization data and just 5 minutes of GPU compute time.
+
+- A Google Colab [notebook](https://colab.research.google.com/github/kamilakesbi/notebooks/blob/main/fine_tune_pyannote.ipynb), with a step-by-step guide on how to use diarizers for fine-tuning a pyannote segmentation model.
 
 
 Edit this `README.md` markdown file to author your organization card.