rahular committed on
Commit 58c9c79, 1 parent: 0a36f6d

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -26,7 +26,7 @@ language:
  Varta-T5 is a model pre-trained on the `full` training set of [Varta](https://huggingface.co/datasets/rahular/varta) in 14 Indic languages (Assamese, Bhojpuri, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Nepali, Oriya, Punjabi, Tamil, Telugu, and Urdu) and English, using span corruption and gap-sentence generation as objectives.
 
  [Varta](https://huggingface.co/datasets/rahular/varta) is a large-scale news corpus for Indic languages, including 41.8 million news articles in 14 different Indic languages (and English), which come from a variety of high-quality sources.
- The dataset and the model are introduced in [this paper](https://arxiv.org/abs/2305.05858). The code is released in [this repository](https://github.com/rahular/varta). The data is released in [this bucket](https://console.cloud.google.com/storage/browser/varta-eu/data-release).
+ The dataset and the model are introduced in [this paper](https://arxiv.org/abs/2305.05858). The code is released in [this repository](https://github.com/rahular/varta).
 
  ## Uses
  You can use this model for causal language modeling, but it's mostly intended to be fine-tuned on a downstream task.
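
Note on the README text in this hunk: it names span corruption as one of the two pre-training objectives. For readers unfamiliar with the term, here is a minimal toy sketch of T5-style span corruption on a whitespace-tokenized sentence. It is an illustration only, not the Varta pre-processing code: the function name `span_corrupt` and the hand-picked spans are hypothetical, and the `<extra_id_N>` sentinel naming is assumed from the usual T5 convention.

```python
from typing import List, Sequence, Tuple


def span_corrupt(
    tokens: Sequence[str], spans: Sequence[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Toy T5-style span corruption (hypothetical helper, not from the Varta repo).

    `spans` must be sorted, non-overlapping, half-open (start, end) index pairs.
    Each span is replaced by one sentinel in the input; the target lists each
    sentinel followed by the tokens it hides, then a closing sentinel.
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"  # assumed T5 sentinel naming
        inputs.extend(tokens[cursor:start])  # keep text before the span
        inputs.append(sentinel)              # mask the span in the input
        targets.append(sentinel)
        targets.extend(tokens[start:end])    # the model must predict these
        cursor = end
    inputs.extend(tokens[cursor:])           # keep text after the last span
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return inputs, targets


tokens = "Thank you for inviting me to your party last week".split()
corrupted, target = span_corrupt(tokens, [(2, 4), (8, 9)])
print(" ".join(corrupted))
print(" ".join(target))
```

In real pre-training the spans are sampled at random over subword tokens rather than chosen by hand; the sketch only shows how the input and target sequences relate.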