autoevaluator HF staff committed on
Commit
a7197e1
1 Parent(s): df0f743

Add evaluation results on the default config and validation_matched split of multi_nli

Beep boop, I am a bot from Hugging Face's automatic model evaluator 👋!\
Your model has been evaluated on the default config and validation_matched split of the [multi_nli](https://huggingface.co/datasets/multi_nli) dataset by @kslnet, using the predictions stored [here](https://huggingface.co/datasets/autoevaluate/autoeval-eval-multi_nli-default-544a62-53715145359).\
Accept this pull request to see the results displayed on the [Hub leaderboard](https://huggingface.co/spaces/autoevaluate/leaderboards?dataset=multi_nli).\
Evaluate your model on more datasets [here](https://huggingface.co/spaces/autoevaluate/model-evaluator?dataset=multi_nli).

Files changed (1)
  1. README.md +67 -0
README.md CHANGED
@@ -8,6 +8,73 @@ datasets:
  - multi_nli
  - wikipedia
  - bookcorpus
+ model-index:
+ - name: roberta-large-mnli
+   results:
+   - task:
+       type: natural-language-inference
+       name: Natural Language Inference
+     dataset:
+       name: multi_nli
+       type: multi_nli
+       config: default
+       split: validation_matched
+     metrics:
+     - type: accuracy
+       value: 0.9059602649006623
+       name: Accuracy
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMDc2ZDk0NDM5YmNkYTJlMGFiNjVjYjJjNmQ0YTQ2M2ExNTA2NDhmMDA1MWU0YzY2YmNmNjJlM2QwODI2Zjc2OSIsInZlcnNpb24iOjF9.vipeBFdoRHhd43kGJ7dtgjBRCugxCgd2-FWjgtsyTVH9hRdwau4IcWVN0Tw1ybwKxSjHYIJtX9-ngofK6sFWCg
+     - type: f1
+       value: 0.9051030334294846
+       name: F1 Macro
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNTZhZWY1ZWEyZjY5NmExOTBjZGU4ZDgwOWE0NThiNTBkNTZhNmUwMGU5MmVjODZlZjk0ODhmOTlkZGUzMmIyNSIsInZlcnNpb24iOjF9.i-Q2k0LVK8K1wPZdnsWYaUU8MpIYaHJtn7DyLc_KpTy98RPGJ4y-sZMMaY57RLeSTnaFK779eGqGu95Fnlv0BA
+     - type: f1
+       value: 0.9059602649006623
+       name: F1 Micro
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNzUxMDgzMGZhMmZlMjgyMjgxNDBjZWYzOTBmYWRlMzFiOTg0YzgzYzYyNzI2OGMxMTkzNmM1M2IyNzgzMzkyYyIsInZlcnNpb24iOjF9.7mo7aWPeBjcJF2A4C4k3Y0u5Y0tmHvCQJSxi59Dc3Jx7i613VDB95_iHatXAovfe7vNE9uN0QG7Q4BHNQnFrAg
+     - type: f1
+       value: 0.9057702464648203
+       name: F1 Weighted
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMTc1NDc1N2I2N2NjMjI5NmIwMWUwNzMxZDQ3MjMwYzE3YWM3NjY0N2RiN2ViZTIyMjA0ZTFjZWNkMWVmMmRhYiIsInZlcnNpb24iOjF9.kgu7F32wG957pLqDU_d5Mbq8SlywzgrLMmxEcVlH5sLelvUcNCUVkD-qUTDDVjbvrwf8O3wHlaHzAGxgRKz-CQ
+     - type: precision
+       value: 0.9051381734323508
+       name: Precision Macro
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNjIyYzIwOWIwOGFhZGU1ZjA2ZjRiYmZkYTY0YWVmNzIwNDdhMzE0ZjBjZWJhZjE0NjZhZjg2YWZhYzA5MjI0OSIsInZlcnNpb24iOjF9.i_EfBOn9_ns2hTOXPfB9yEWYj45DEsleGA0IY0k9C8CY6S8heuINKtFQba_SpPblQMro93TOBYF-iQnHPUD0CA
+     - type: precision
+       value: 0.9059602649006623
+       name: Precision Micro
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiM2UxODE1YzFhZmVlM2Q1YjQ2MmNlZDFlYzQyZjQ4YzJiOTk2YTA2YzRhZjM1ZDUzYTA0NzRlMzk1NzVmODU5YSIsInZlcnNpb24iOjF9.EaQyZ1n1_gLwcmXwHpWe6laJhaZ_dEIbXDDeAMuTKvED1A_dwdsjsfAQ3JbEgV56kgMtcbeGer9339ocqLEIBQ
+     - type: precision
+       value: 0.9056708619045606
+       name: Precision Weighted
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiYjFhN2QwZGJlYWMyOTQ4MzAzYTFiMTcwNjY2NGQxNjkwNzI3ZDc4MmY4Y2Y2ZmMzMDQwYzYzMGI3YjUyYzJiMiIsInZlcnNpb24iOjF9.iWiXehYIb3AKBLUCR6lVdoCoANwyjNb1uZvtxHddFYvLUIwQzBGaAH-_S1pRaEjEZpoa4tarCj5cmw2KTSEwBA
+     - type: recall
+       value: 0.905161576355605
+       name: Recall Macro
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZWNiOTc3YmFjODEzZTZmY2ZhZjliNDNkMjBmNGUwNmFjNWRjZmUyYWViZTVmNDQ2N2U0ZDgzODY3ODdmY2U3OSIsInZlcnNpb24iOjF9.hg_oj1175LM3r9WyhuBL4p8kjEaWvZLPH16LEo9qa18PWBIxD1qPmzcB-ADlT1l9yAw7f7MFYdd0WFawEVB_Bw
+     - type: recall
+       value: 0.9059602649006623
+       name: Recall Micro
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMzhiNTQzYTE1NmQyZjkzYmRjMWE4ZDVkZjFmODliYTU0NDQ0MWUyYTIyNDFiMTY5ZmYxMGFkMmIyYTIyNDZjMCIsInZlcnNpb24iOjF9.AY7djK19OtjQf1ZlLqTObg71Jskmb_5vkMXqB31Pq-Qg1YXu8uHn6-b7nDMSHA8xcoWbBEvPwPxQnnpQ8-hIDw
+     - type: recall
+       value: 0.9059602649006623
+       name: Recall Weighted
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiY2U4NDQxM2E1ODgyOTY1ZDc2ODE4NzNmZDEwMTVkNTYzZTQ3M2EzMDRlNmE0Mjc5ZjIzZWQ1YjRlNTFlMGQ0MiIsInZlcnNpb24iOjF9.qN-tG13P4Cuw42nT3zBm4ox7CPrP7ShPXli0Jtf7-ycGD0NIYkHbPqoXgtIawrl-KD8wu8HqEniAt5kjbXjDDA
+     - type: loss
+       value: 0.28174877166748047
+       name: loss
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiN2UyMzdlODViNTBjMmJiNmIzMDdjZDg1MjgxZWU4NGYwM2U4ZmJlN2U5ZmVhYTdlMDExZWExY2IyOWViZjE1NiIsInZlcnNpb24iOjF9.FkiwiKl2c8KpGYlP-xtnXumzoOGL_Y8XJQ_ScpXhS8slztLzjYNESo9TXHzb_-_YO-o3RN84pBGpOqEPDm4UBw
  ---
 
  # roberta-large-mnli
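
Five of the metrics added in this diff share the value 0.9059602649006623: Accuracy, F1 Micro, Precision Micro, Recall Micro, and Recall Weighted. That is expected for single-label multiclass classification, where every wrong prediction counts as exactly one false positive and one false negative, so the pooled (micro) scores and the support-weighted recall all reduce to accuracy. The following is a minimal sketch of that identity in plain Python. It is not the evaluator's code: the helper name `summarize` and the toy labels are hypothetical and are not drawn from multi_nli.

```python
import math
from collections import Counter


def summarize(y_true, y_pred):
    """Accuracy, micro P/R/F1, and weighted recall for single-label data.

    Hypothetical helper for illustration only.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    n = len(y_true)
    support = Counter(y_true)  # number of examples per true class
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1  # correct prediction for class t
        else:
            fp[p] += 1  # predicted class p gains a false positive
            fn[t] += 1  # true class t gains a false negative

    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())

    accuracy = total_tp / n
    # Micro averaging pools the counts over all classes before dividing.
    micro_precision = total_tp / (total_tp + total_fp)
    micro_recall = total_tp / (total_tp + total_fn)
    denom = micro_precision + micro_recall
    micro_f1 = 2 * micro_precision * micro_recall / denom if denom else 0.0
    # Weighted recall averages per-class recall, weighted by class support.
    weighted_recall = sum(
        (support[c] / n) * (tp[c] / support[c]) for c in support
    )
    return {
        "accuracy": accuracy,
        "micro_precision": micro_precision,
        "micro_recall": micro_recall,
        "micro_f1": micro_f1,
        "weighted_recall": weighted_recall,
    }


if __name__ == "__main__":
    # Toy three-class labels (hypothetical, not real model output).
    scores = summarize([0, 0, 1, 1, 2, 2, 2, 1], [0, 1, 1, 1, 2, 0, 2, 2])
    for name, value in scores.items():
        print(f"{name}: {value:.4f}")
    assert all(math.isclose(v, scores["accuracy"]) for v in scores.values())
```

Macro-averaged scores and weighted precision or F1 do not collapse this way, which is consistent with those entries in the diff carrying slightly different values.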