skypro1111 committed
Commit eb01b22
1 Parent(s): 9df5160

Update README.md

Files changed (1): README.md (+3 −2)
README.md CHANGED
@@ -6,13 +6,14 @@ tags: []
 # Model Card for mbart-large-50-verbalization
 
 ## Model Description
-`mbart-large-50-verbalization` is a fine-tuned version of the `mbart-large-50` model, specifically designed for the task of verbalizing Ukrainian text to prepare it for Text-to-Speech (TTS) systems. This model aims to transform structured data like numbers and dates into their fully expanded textual representations in Ukrainian.
+`mbart-large-50-verbalization` is a fine-tuned version of the [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) model, specifically designed for the task of verbalizing Ukrainian text to prepare it for Text-to-Speech (TTS) systems. This model aims to transform structured data like numbers and dates into their fully expanded textual representations in Ukrainian.
 
 ## Architecture
-This model is based on the `mbart-large-50` architecture, renowned for its effectiveness in translation and text generation tasks across numerous languages.
+This model is based on the [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) architecture, renowned for its effectiveness in translation and text generation tasks across numerous languages.
 
 ## Training Data
 The model was fine-tuned on a subset of 96,780 sentences from the Ubertext dataset, focusing on news content. The verbalized equivalents were created using Google Gemini Pro, providing a rich basis for learning text transformation tasks.
+Dataset [skypro1111/ubertext-2-news-verbalized](https://huggingface.co/datasets/skypro1111/ubertext-2-news-verbalized)
 
 ## Training Procedure
 The model underwent nearly 70,000 training steps, amounting to almost 2 epochs, to ensure thorough learning from the training dataset.
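
For context, a minimal inference sketch for the verbalization task described in this model card, using 🤗 Transformers. The repository id `skypro1111/mbart-large-50-verbalization`, the `uk_UA` mBART-50 language code, and the generation settings are assumptions, not details stated in this commit:

```python
import torch
from transformers import AutoTokenizer, MBartForConditionalGeneration

# Assumed repository id; the commit only names the model "mbart-large-50-verbalization".
model_name = "skypro1111/mbart-large-50-verbalization"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_name)  # standard mBART-50 tokenizer assumed
model = MBartForConditionalGeneration.from_pretrained(model_name).to(device).eval()

# Ukrainian in and out; "uk_UA" is mBART-50's Ukrainian code (assumed for this fine-tune).
tokenizer.src_lang = "uk_UA"

text = "Зустріч запланована на 15.03.2024 о 10:30."  # date and time the model should spell out
inputs = tokenizer(text, return_tensors="pt").to(device)

with torch.no_grad():
    generated = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.lang_code_to_id["uk_UA"],  # decode into Ukrainian
        max_length=1024,
        num_beams=5,
    )
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```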
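Similarly, a quick way to inspect the training dataset linked in this commit with 🤗 Datasets; the `train` split name and column layout are assumptions:

```python
from datasets import load_dataset

# Dataset id comes from the link added in this commit;
# the split name and column layout are assumptions.
ds = load_dataset("skypro1111/ubertext-2-news-verbalized", split="train")
print(ds)     # features and row count (~96,780 sentences per the model card)
print(ds[0])  # one source/verbalized example pair
```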