Change Flax GPT2 with cross-attn layers to be the same as PyTorch's version 7d29388 ydshieh committed on Aug 4, 2021
Change Flax GPT2 with cross-attn outputs to be the same as PyTorch's version 165ad1e ydshieh committed on Aug 4, 2021