What about latencies?

#3
by LorenzoCevolaniAXA - opened

Do you have a latency benchmark for the full Mixtral on a 48xlarge vs. the Medusa-modified Mixtral AWQ here on a 12xlarge?
