gorkaartola committed 7e0ba0e (1 parent: 1480616): Upload README.md

README.md CHANGED
@@ -21,36 +21,37 @@ This metric is specially designed to measure the performance of sentence classif
In addition to the classical *predictions* and *references* inputs, this metric includes a *kwarg* named *prediction_strategies* *(list(list(str, int(optional))))*, which refers to a family of prediction strategies that the metric can handle.

Add *predictions*, *references* and *prediction_strategies* as follows:
```
import evaluate

# metric_selector is the metric's id or local path; for this Space it would
# presumably be "gorkaartola/metric_for_tp_fp_samples".
metric = evaluate.load(metric_selector)
metric.add_batch(predictions=predictions, references=references)
results = metric.compute(prediction_strategies=prediction_strategies)
```
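
A minimal end-to-end sketch with dummy values (the array shapes follow the *Inputs* section below; the metric id, the 2-sentence/17-class sizes, and the random scores are illustrative assumptions):
```
import evaluate
import numpy as np

# Hypothetical setup: 2 sentences scored against 17 classes.
metric = evaluate.load("gorkaartola/metric_for_tp_fp_samples")  # assumed id of this Space
predictions = np.random.rand(2, 17).astype(np.float32)    # softmax entailment logits per class
references = np.array([[8, 0], [16, 2]], dtype=np.int32)  # columns: label_ids, nli_label

metric.add_batch(predictions=predictions, references=references)
results = metric.compute(prediction_strategies=[["argmax_max"], ["topk", 3]])
```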

The minimum fields required by this metric for the test datasets are the following (not necessarily with these names):
- *title*, containing the first sentence to be compared with different queries representing each class.
- *label_ids*, containing the *id* of the class the sample refers to. Including samples of all the classes is advised.
- *nli_label*, which is '0' if the sample represents a True Positive or '2' if the sample represents a False Positive, meaning that the *label_ids* is incorrectly assigned to the *title*. Including both True Positive and False Positive samples for all classes is advised.

Example:

|title |label_ids |nli_label |
|-----------------------------------------------------------------------------------|:---------:|:----------:|
|'Together we can save the arctic': celebrity advocacy and the Rio Earth Summit 2012| 8 | 0 |
|Tuple-based semantic and structural mapping for a sustainable interoperability | 16 | 2 |
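
As a sketch, such a test dataset could be assembled with the datasets library; the field names and values below simply mirror the example table (an illustration, not part of the metric):
```
from datasets import Dataset

# Two samples: one True Positive (nli_label 0) and one False Positive (nli_label 2).
test_dataset = Dataset.from_dict({
    "title": [
        "'Together we can save the arctic': celebrity advocacy and the Rio Earth Summit 2012",
        "Tuple-based semantic and structural mapping for a sustainable interoperability",
    ],
    "label_ids": [8, 16],
    "nli_label": [0, 2],
})
```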

### Inputs

- *predictions*, *(numpy.array(float32)[sentences to classify, number of classes])*: numpy array with the softmax logits values of the entailment dimension of the NLI inference on the sentences to be classified for each class.
- *references*, *(numpy.array(int32)[sentences to classify, 2])*: numpy array with the reference *label_ids* and *nli_label* of the sentences to be classified, given in the *test_dataset*.
- a *kwarg* named *prediction_strategies* *(list(list(str, int(optional))))*, where each inner list describes a desired prediction strategy. The *prediction_strategies* implemented in this metric are:
  - *argmax*, which takes the highest value of the softmax inference logits to select the prediction. Syntax: *["argmax_max"]*
  - *threshold*, which takes all softmax inference logits above a certain value to select the predictions. Syntax: *["threshold", desired value]*
  - *topk*, which takes the highest *k* softmax inference logits to select the predictions. Syntax: *["topk", desired value]*

Example:
```
prediction_strategies = [['argmax_max'], ['threshold', 0.5], ['topk', 3]]
```
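
To make the selection semantics concrete, here is an illustrative numpy sketch of what each strategy would pick from one row of softmax logits (an interpretation of the descriptions above, not the metric's internal code):
```
import numpy as np

logits = np.array([0.1, 0.7, 0.6, 0.2], dtype=np.float32)  # entailment scores for 4 classes

argmax_pick = [int(np.argmax(logits))]                   # ["argmax_max"]     -> [1]
threshold_picks = np.flatnonzero(logits > 0.5).tolist()  # ["threshold", 0.5] -> [1, 2]
topk_picks = np.argsort(logits)[::-1][:3].tolist()       # ["topk", 3]        -> [1, 2, 3]
```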

### Output Values

@@ -66,4 +67,4 @@ BibLaTeX
url = {https://huggingface.co/spaces/gorkaartola/metric_for_tp_fp_samples},
urldate = {2022-08-11}
}
```