ydshieh committed on
Commit
5c4c715
1 Parent(s): c0cae7b

improve desc

Files changed (1)
  1. app.py +1 -0
app.py CHANGED
@@ -11,6 +11,7 @@ st.sidebar.markdown(
  """
  An image caption model [ViT-GPT2](https://huggingface.co/flax-community/vit-gpt2/tree/main) by combining the ViT model and a French GPT2 model.
  [Part of the [Huggingface JAX/Flax event](https://discuss.huggingface.co/t/open-to-the-community-community-week-using-jax-flax-for-nlp-cv/).]\n
+ The GPT2 model source code is modified so it can accept an encoder's output.
  The pretained weights of both models are loaded, with a set of randomly initialized cross-attention weigths.
  The model is trained on 65000 images from the COCO dataset for about 1500 steps, with the original english cpationis are translated to french for training purpose.
  """