How important is the grouped_topk?

#6
by dzhulgakov - opened

Hi,

I ran the model with just regular top_k softmax routing and got pretty sensible results. How important is it to include the hierarchical grouped top_k?
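For context, here is a rough sketch of the difference I mean. It is not the model's actual routing code: the function names, the "score a group by its best expert" rule, and the toy logits are all my own assumptions about one common form of grouped top_k.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of router logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def plain_topk(scores, k):
    # Indices of the k highest-scoring experts, best first.
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def grouped_topk(scores, k, n_group, topk_group):
    # Assumed scheme: split experts into n_group equal groups, score each
    # group by its best expert, keep the topk_group best groups, then take
    # the top k experts from the kept groups only.
    group_size = len(scores) // n_group
    group_best = [
        max(scores[g * group_size:(g + 1) * group_size]) for g in range(n_group)
    ]
    kept_groups = set(plain_topk(group_best, topk_group))
    candidates = [i for i in range(len(scores)) if i // group_size in kept_groups]
    return sorted(candidates, key=lambda i: scores[i], reverse=True)[:k]

# Toy example: 8 hypothetical experts in 4 groups of 2.
logits = [2.0, 0.1, 1.9, 0.0, 1.8, 1.7, 0.3, 0.1]
scores = softmax(logits)

plain = plain_topk(scores, k=2)
grouped = grouped_topk(scores, k=2, n_group=4, topk_group=1)

print(plain)    # [0, 2]  experts from two different groups
print(grouped)  # [0, 1]  both experts forced into the single best group
```

In this toy case the two selections differ only in the second expert, which is why I'm wondering how much the grouping matters for output quality as opposed to, say, keeping the selected experts close together.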

dzhulgakov changed discussion status to closed
