Can someone rate this personally?
#6, opened by Noxi-V
I feel like ever since Reflection, I do not trust benchmarks, since models can practically be trained on them and cheese them.
Has anyone tried this? How good is it in a real use case?
Not an answer, but this model's training is completely transparent/traceable.
- As per the model card, it was fine-tuned from a specific base model (which used to be #1 on that leaderboard) on a small sample of a specific dataset. Both are listed in the model card.
Curious to hear how it is working on real use cases as well!