hamishivi
/

hypertask_T0_11B

Text2Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

hamishivi commited on May 17, 2023

Commit

59ae7ef

•

1 Parent(s): 25372d4

fix google t5 link

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -6,7 +6,7 @@ language:
 ---
 An 11B T5 model trained on the [P3](https://huggingface.co/datasets/bigscience/P3) (T0 split) dataset for 20,000 steps with a batch size of 2048 a maximum input sequence length of 1024, a maximum output sequence length of 256, and the Adafactor optimizer with a constant learning rate of 0.001.
-The model is trained from the [T5 v1.1 lm-adapt checkpoint](https://huggingface.co/google/t5-xl-lm-adapt) and fully finetuned.
 For more details, see [HINT: Hypernetwork Instruction Tuning for Efficient Zero- & Few-Shot Generalisation](https://arxiv.org/abs/2212.10315).

 ---
 An 11B T5 model trained on the [P3](https://huggingface.co/datasets/bigscience/P3) (T0 split) dataset for 20,000 steps with a batch size of 2048 a maximum input sequence length of 1024, a maximum output sequence length of 256, and the Adafactor optimizer with a constant learning rate of 0.001.
+The model is trained from the [T5 v1.1 lm-adapt checkpoint](https://huggingface.co/google/t5-xxl-lm-adapt) and fully finetuned.
 For more details, see [HINT: Hypernetwork Instruction Tuning for Efficient Zero- & Few-Shot Generalisation](https://arxiv.org/abs/2212.10315).