Evaluation results for MoritzLaurer/DeBERTa-v3-base-mnli model as a base model for other tasks

As part of a research effort to identify high-quality models on Hugging Face that can serve as base models for further fine-tuning, we evaluated this model by fine-tuning it on 36 datasets. The model ranks 1st among all tested models for the microsoft/deberta-v3-base architecture as of 09/01/2023.

To share this information with others in your model card, please add the following evaluation results to your README.md page.

For more information please see https://ibm.github.io/model-recycling/ or contact me.

Best regards,
Elad Venezian
[email protected]
IBM Research AI

Files changed (1): README.md +14 -1

@@ -58,4 +58,17 @@ If you want to cite this model, please cite the original DeBERTa paper, the resp
 If you have questions or ideas for cooperation, contact me at m{dot}laurer{at}vu{dot}nl or [LinkedIn](https://www.linkedin.com/in/moritz-laurer/)
 ### Debugging and issues
-Note that DeBERTa-v3 was released recently and older versions of HF Transformers seem to have issues running the model (e.g. resulting in an issue with the tokenizer). Using Transformers==4.13 might solve some issues.

+Note that DeBERTa-v3 was released recently and older versions of HF Transformers seem to have issues running the model (e.g. resulting in an issue with the tokenizer). Using Transformers==4.13 might solve some issues.
+## Model Recycling
+[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=0.97&mnli_lp=nan&20_newsgroup=-0.39&ag_news=0.19&amazon_reviews_multi=0.10&anli=1.31&boolq=0.81&cb=8.93&cola=0.01&copa=13.60&dbpedia=-0.23&esnli=-0.51&financial_phrasebank=0.61&imdb=-0.26&isear=-0.35&mnli=-0.34&mrpc=1.24&multirc=1.50&poem_sentiment=-0.19&qnli=0.30&qqp=0.13&rotten_tomatoes=-0.55&rte=3.57&sst2=0.35&sst_5bins=0.39&stsb=1.10&trec_coarse=-0.36&trec_fine=-0.02&tweet_ev_emoji=1.11&tweet_ev_emotion=-0.35&tweet_ev_hate=1.43&tweet_ev_irony=-2.65&tweet_ev_offensive=-1.69&tweet_ev_sentiment=-1.51&wic=0.57&wnli=-2.61&wsc=9.95&yahoo_answers=-0.33&model_name=MoritzLaurer%2FDeBERTa-v3-base-mnli&base_name=microsoft%2Fdeberta-v3-base) using MoritzLaurer/DeBERTa-v3-base-mnli as a base model yields average score of 80.01 in comparison to 79.04 by microsoft/deberta-v3-base.
+The model is ranked 1st among all tested models for the microsoft/deberta-v3-base architecture as of 09/01/2023
+Results:
+|   20_newsgroup |   ag_news |   amazon_reviews_multi |    anli |   boolq |      cb |    cola |   copa |   dbpedia |   esnli |   financial_phrasebank |   imdb |   isear |    mnli |    mrpc |   multirc |   poem_sentiment |    qnli |     qqp |   rotten_tomatoes |     rte |    sst2 |   sst_5bins |   stsb |   trec_coarse |   trec_fine |   tweet_ev_emoji |   tweet_ev_emotion |   tweet_ev_hate |   tweet_ev_irony |   tweet_ev_offensive |   tweet_ev_sentiment |     wic |    wnli |     wsc |   yahoo_answers |
+|---------------:|----------:|-----------------------:|--------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|--------:|------------:|-------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|--------:|--------:|----------------:|
+|        86.0196 |   90.6333 |                  66.96 | 60.0938 |  83.792 | 83.9286 | 86.5772 |     72 |      79.2 |  91.419 |                   85.1 | 94.232 | 71.5124 | 89.4426 | 90.4412 |   63.7583 |          86.5385 | 93.8129 | 91.9144 |           89.8687 | 85.9206 | 95.4128 |     57.3756 | 91.377 |          97.4 |          91 |           47.302 |            83.6031 |         57.6431 |          77.1684 |              83.3721 |              70.2947 | 71.7868 | 67.6056 | 74.0385 |            71.7 |
+For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)
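The reported average can be checked directly against the per-dataset results in the table above. The following is a minimal sketch, not part of the proposed README change: the scores are copied from the results row, the baseline of 79.04 is taken as reported for microsoft/deberta-v3-base, and the helper name `mean_score` is a hypothetical one chosen for illustration. It assumes the reported average is an unweighted mean over the 36 datasets.

```python
# Per-dataset scores copied from the results table in the proposed README section.
scores = {
    "20_newsgroup": 86.0196, "ag_news": 90.6333, "amazon_reviews_multi": 66.96,
    "anli": 60.0938, "boolq": 83.792, "cb": 83.9286, "cola": 86.5772,
    "copa": 72.0, "dbpedia": 79.2, "esnli": 91.419,
    "financial_phrasebank": 85.1, "imdb": 94.232, "isear": 71.5124,
    "mnli": 89.4426, "mrpc": 90.4412, "multirc": 63.7583,
    "poem_sentiment": 86.5385, "qnli": 93.8129, "qqp": 91.9144,
    "rotten_tomatoes": 89.8687, "rte": 85.9206, "sst2": 95.4128,
    "sst_5bins": 57.3756, "stsb": 91.377, "trec_coarse": 97.4,
    "trec_fine": 91.0, "tweet_ev_emoji": 47.302, "tweet_ev_emotion": 83.6031,
    "tweet_ev_hate": 57.6431, "tweet_ev_irony": 77.1684,
    "tweet_ev_offensive": 83.3721, "tweet_ev_sentiment": 70.2947,
    "wic": 71.7868, "wnli": 67.6056, "wsc": 74.0385, "yahoo_answers": 71.7,
}


def mean_score(results):
    """Unweighted mean over all datasets (assumed averaging scheme)."""
    return sum(results.values()) / len(results)


average = mean_score(scores)
baseline_average = 79.04  # reported average for microsoft/deberta-v3-base
gain = average - baseline_average

print(f"{len(scores)} datasets, average {average:.2f}, gain over baseline {gain:.2f}")
# prints "36 datasets, average 80.01, gain over baseline 0.97"
```

Under that assumption the table reproduces both the stated average of 80.01 and the `avg=0.97` gain encoded in the chart link.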