Submit to leaderboard

#1
by distantquant - opened

Would be cool to see its scores.

Already submitted, waiting for evaluation.

By the way, it's nice that you've added some of my models to your own top list, but adding models without testing them yourself first is not a wise choice. How can you know for sure without trying them? What if the benchmarks were gamed and performance in practice is shit? What if someone recreated phi-CTNL-1M again?

Don't take my top model list too seriously

I personally don't have the inference capability for 70B yet, so I have to base my judgements on benchmarks and others' interactions with models

( ͡°╭͜ʖ╮ ͡°)

distantquant changed discussion status to closed
