does not return hidden states
Here's a modified version that returns attention and hidden states: https://huggingface.co/wassname/phi-2-GPTQ_w_hidden_states/blob/main/configuration_phi.py
@wassname
Is there any plan to actually change phi-2?
I ask because the following warning is still on the main page:
"Remark: In the generation function, our model currently does not support beam search (num_beams > 1). Furthermore, in the forward pass of the model, we currently do not support outputting hidden states or attention values, or using custom input embeddings."
I personally use custom input embeddings a lot, and in my opinion this makes phi unusable for many use cases.
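For anyone unsure what these features look like in practice, here is a minimal sketch of the usual `transformers` calling convention: passing `inputs_embeds` instead of `input_ids`, and requesting `output_hidden_states` / `output_attentions`. It uses a tiny, randomly initialised GPT-2 as a stand-in so it runs without downloading any weights; that stand-in and its sizes are my own assumptions for illustration, not Phi-2. Whether a given phi-2 revision honours these arguments is exactly what the remark above is about.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random stand-in model (illustrative assumption, NOT Phi-2).
# Eager attention is requested so attention weights can be returned.
config = GPT2Config(
    vocab_size=100, n_positions=64, n_embd=32, n_layer=2, n_head=2,
    attn_implementation="eager",
)
model = GPT2LMHeadModel(config).eval()

input_ids = torch.tensor([[1, 2, 3, 4, 5]])

# Custom input embeddings: here they are just the model's own token
# embeddings, but any tensor of shape (batch, seq_len, n_embd) would do.
embeds = model.get_input_embeddings()(input_ids)

with torch.no_grad():
    out = model(
        inputs_embeds=embeds,
        output_hidden_states=True,
        output_attentions=True,
    )
    ref = model(input_ids=input_ids)

# hidden_states: the embedding output plus one tensor per layer.
print(len(out.hidden_states), out.hidden_states[-1].shape)
# attentions: one (batch, heads, seq_len, seq_len) tensor per layer.
print(len(out.attentions), out.attentions[0].shape)
# Feeding the model's own embeddings should reproduce the input_ids logits.
print(torch.allclose(out.logits, ref.logits, atol=1e-5))
```

With a model that supports these arguments, the same three keyword arguments should work unchanged after swapping the stand-in for the real checkpoint.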
Thanks Gustov, much appreciated. Phi-2 is an awesome model for research, as it fits on consumer GPUs even when doing strange experiments (VAE, adapters, probing).
Amazing! I can't wait.
It looks like they did fix it, thanks to whoever did that :)
Yay, glad to hear it, as Gemma still isn't so great compared to Phi-2 xD