Evaluation results for ibm/ColD-Fusion-bert-base-uncased-itr22-seed0 model as a base model for other tasks
#1 opened by eladven

README.md CHANGED
@@ -51,6 +51,20 @@ output = model(encoded_input)
 ```

 ## Evaluation results
+
+## Model Recycling
+
+[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=3.25&mnli_lp=nan&20_newsgroup=2.21&ag_news=-0.79&amazon_reviews_multi=0.34&anli=0.55&boolq=5.26&cb=14.20&cola=-0.43&copa=9.55&dbpedia=0.37&esnli=0.94&financial_phrasebank=15.47&imdb=0.50&isear=0.68&mnli=0.68&mrpc=4.04&multirc=0.80&poem_sentiment=16.01&qnli=-0.48&qqp=0.09&rotten_tomatoes=4.83&rte=18.00&sst2=1.72&sst_5bins=3.09&stsb=3.07&trec_coarse=1.14&trec_fine=12.67&tweet_ev_emoji=-0.12&tweet_ev_emotion=2.07&tweet_ev_hate=-1.57&tweet_ev_irony=1.50&tweet_ev_offensive=-0.02&tweet_ev_sentiment=-0.06&wic=2.58&wnli=-1.27&wsc=0.38&yahoo_answers=-1.08&model_name=ibm%2FColD-Fusion-bert-base-uncased-itr22-seed0&base_name=bert-base-uncased) using ibm/ColD-Fusion-bert-base-uncased-itr22-seed0 as a base model yields an average score of 75.45, compared to 72.20 for bert-base-uncased.
+
+The model is ranked 3rd among all tested models for the bert-base-uncased architecture as of 09/01/2023.
+Results:
+
+| 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
+|---------------:|----------:|-----------------------:|-------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|--------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|--------:|------:|----------------:|
+| 85.2629 | 88.8 | 66.26 | 47.5 | 74.2202 | 78.5714 | 81.3998 | 59 | 78.5333 | 90.6454 | 84 | 92.072 | 69.7523 | 84.4081 | 86.0294 | 60.7673 | 82.6923 | 89.4014 | 90.3661 | 89.6811 | 77.9783 | 93.6927 | 55.8824 | 88.9308 | 97.2 | 81 | 35.884 | 81.9845 | 51.2795 | 69.2602 | 85.3488 | 69.4155 | 65.8307 | 49.2958 | 62.5 | 71.2333 |
+
+
+For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)
 See full evaluation results of this model and many more [here](https://ibm.github.io/model-recycling/roberta-base_table.html)
 When fine-tuned on downstream tasks, this model achieves the following results:
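The scores in the added table come from fine-tuning this checkpoint on each downstream task. As a rough illustration of that workflow, the sketch below loads the recycled encoder with Hugging Face `transformers` and fine-tunes it on one such task; the dataset choice (GLUE SST-2), hyperparameters, and output directory are illustrative assumptions, not the exact recipe behind the reported numbers.

```python
# Minimal fine-tuning sketch for the recycled checkpoint.
# Assumptions: GLUE SST-2 as the downstream task and generic
# hyperparameters -- not the exact setup used for the card's results.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "ibm/ColD-Fusion-bert-base-uncased-itr22-seed0"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Only the encoder weights are recycled; the classification head is freshly initialized.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    # Tokenize the raw sentences; padding is handled by the Trainer's default collator.
    return tokenizer(batch["sentence"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="cold-fusion-itr22-sst2",  # illustrative path
    learning_rate=2e-5,
    num_train_epochs=3,
    per_device_train_batch_size=32,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,
)
trainer.train()
print(trainer.evaluate())
```

The same pattern applies to any of the 36 datasets in the table; only the dataset loading, the text column passed to the tokenizer, and `num_labels` change per task.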