Update README.md
README.md
CHANGED
@@ -4,4 +4,4 @@ Part of the [Huggingface JAX/Flax event](https://discuss.huggingface.co/t/open-t
 
 The GPT2 model source code is modified so it can accept an encoder's output.
 The pretained weights of both models are loaded, with a set of randomly initialized cross-attention weigths.
-The model is trained on 65000 images from the COCO dataset for about 1500 steps, with the original english cpationis
+The model is trained on 65000 images from the COCO dataset for about 1500 steps (batch\_size=256), with the original english cpationis being translated to french for training purpose.
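
Review note: with the batch size now stated, the training budget in the changed line can be sanity-checked. The sketch below is only a back-of-the-envelope calculation, not part of the repository. It assumes each of the 65000 images contributes exactly one training example per pass, which the README does not state (COCO images usually carry several captions), and the function name `training_budget` is hypothetical.

```python
from typing import Tuple

# Figures taken from the README line changed in this commit.
NUM_IMAGES = 65_000   # "65000 images from the COCO dataset"
NUM_STEPS = 1_500     # "about 1500 steps"
BATCH_SIZE = 256      # "batch_size=256", added by this commit


def training_budget(num_examples: int, num_steps: int, batch_size: int) -> Tuple[int, float]:
    """Return (examples processed, approximate passes over the data).

    Assumption (not from the source): one training example per image per pass.
    """
    examples_seen = num_steps * batch_size
    epochs = examples_seen / num_examples
    return examples_seen, epochs


examples_seen, epochs = training_budget(NUM_IMAGES, NUM_STEPS, BATCH_SIZE)
print(f"examples processed: {examples_seen}")
print(f"approximate epochs: {epochs:.1f}")
```

Under that assumption, 1500 steps at batch size 256 process 384,000 examples, or roughly 5.9 passes over 65000 images. If every caption counts as a separate example, the number of passes would be lower.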