gorkaartola committed 7e0ba0e (1 parent: 1480616): Upload README.md

README.md CHANGED
@@ -21,36 +21,37 @@ This metric is specially designed to measure the performance of sentence classif
In addition to the classical *predictions* and *references* inputs, this metric includes a *kwarg* named *prediction_strategies* *(list(list(str, int(optional))))*, which refers to a family of prediction strategies that the metric can handle.

Add *predictions*, *references* and *prediction_strategies* as follows:
```
import evaluate

# metric_selector is the metric's id or local path; for this Space it would
# presumably be "gorkaartola/metric_for_tp_fp_samples".
metric = evaluate.load(metric_selector)
metric.add_batch(predictions=predictions, references=references)
results = metric.compute(prediction_strategies=prediction_strategies)
```
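
A minimal end-to-end sketch with dummy values (the array shapes follow the *Inputs* section below; the metric id, the 2-sentence/17-class sizes, and the random scores are illustrative assumptions):
```
import evaluate
import numpy as np

# Hypothetical setup: 2 sentences scored against 17 classes.
metric = evaluate.load("gorkaartola/metric_for_tp_fp_samples")  # assumed id of this Space
predictions = np.random.rand(2, 17).astype(np.float32)    # softmax entailment logits per class
references = np.array([[8, 0], [16, 2]], dtype=np.int32)  # columns: label_ids, nli_label

metric.add_batch(predictions=predictions, references=references)
results = metric.compute(prediction_strategies=[["argmax_max"], ["topk", 3]])
```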

The minimum fields required by this metric for the test datasets are the following (not necessarily with these names):
- *title*, containing the first sentence to be compared with different queries representing each class.
- *label_ids*, containing the *id* of the class the sample refers to. Including samples of all the classes is advised.
- *nli_label*, which is '0' if the sample represents a True Positive or '2' if the sample represents a False Positive, meaning that the *label_ids* is incorrectly assigned to the *title*. Including both True Positive and False Positive samples for all classes is advised.

Example:

|title |label_ids |nli_label |
|-----------------------------------------------------------------------------------|:---------:|:----------:|
|'Together we can save the arctic': celebrity advocacy and the Rio Earth Summit 2012| 8 | 0 |
|Tuple-based semantic and structural mapping for a sustainable interoperability | 16 | 2 |
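
As a sketch, such a test dataset could be assembled with the datasets library; the field names and values below simply mirror the example table (an illustration, not part of the metric):
```
from datasets import Dataset

# Two samples: one True Positive (nli_label 0) and one False Positive (nli_label 2).
test_dataset = Dataset.from_dict({
    "title": [
        "'Together we can save the arctic': celebrity advocacy and the Rio Earth Summit 2012",
        "Tuple-based semantic and structural mapping for a sustainable interoperability",
    ],
    "label_ids": [8, 16],
    "nli_label": [0, 2],
})
```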

### Inputs

- *predictions*, *(numpy.array(float32)[sentences to classify, number of classes])*: numpy array with the softmax logits values of the entailment dimension of the NLI inference on the sentences to be classified for each class.
- *references*, *(numpy.array(int32)[sentences to classify, 2])*: numpy array with the reference *label_ids* and *nli_label* of the sentences to be classified, given in the *test_dataset*.
- a *kwarg* named *prediction_strategies* *(list(list(str, int(optional))))*, where each inner list describes a desired prediction strategy. The *prediction_strategies* implemented in this metric are:
  - *argmax*, which takes the highest value of the softmax inference logits to select the prediction. Syntax: *["argmax_max"]*
  - *threshold*, which takes all softmax inference logits above a certain value to select the predictions. Syntax: *["threshold", desired value]*
  - *topk*, which takes the highest *k* softmax inference logits to select the predictions. Syntax: *["topk", desired value]*

Example:
```
prediction_strategies = [['argmax_max'], ['threshold', 0.5], ['topk', 3]]
```
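
To make the selection semantics concrete, here is an illustrative numpy sketch of what each strategy would pick from one row of softmax logits (an interpretation of the descriptions above, not the metric's internal code):
```
import numpy as np

logits = np.array([0.1, 0.7, 0.6, 0.2], dtype=np.float32)  # entailment scores for 4 classes

argmax_pick = [int(np.argmax(logits))]                   # ["argmax_max"]     -> [1]
threshold_picks = np.flatnonzero(logits > 0.5).tolist()  # ["threshold", 0.5] -> [1, 2]
topk_picks = np.argsort(logits)[::-1][:3].tolist()       # ["topk", 3]        -> [1, 2, 3]
```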

### Output Values

@@ -66,4 +67,4 @@ BibLaTeX
url = {https://huggingface.co/spaces/gorkaartola/metric_for_tp_fp_samples},
urldate = {2022-08-11}
}
```