---
title: matching_series
tags:
- evaluate
- metric
description: "Matching-based time-series generation metric"
sdk: gradio
sdk_version: 3.50
app_file: app.py
pinned: false
---

# Metric Card for matching_series

## Metric Description

Matching Series is a metric for evaluating time-series generation models. It is based on the idea of matching the generated time series with the original time series. The metric computes the Mean Squared Error (MSE) between matched pairs of generated and original instances. It outputs a score greater than or equal to 0, where 0 indicates a perfect generation.

## How to Use

At minimum, the metric requires the original time series and the generated time series as input. The metric can be used to evaluate the performance of time-series generation models.

```python
>>> import numpy as np
>>> import evaluate
>>> num_generation = 100
>>> num_reference = 10
>>> seq_len = 100
>>> num_features = 10
>>> references = np.random.rand(num_reference, seq_len, num_features)
>>> predictions = np.random.rand(num_generation, seq_len, num_features)
>>> metric = evaluate.load("bowdbeg/matching_series")
>>> results = metric.compute(references=references, predictions=predictions, batch_size=1000)
>>> print(results)
{'matching_mse': 0.15873331613053895, 'harmonic_mean': 0.15623569099681772, 'covered_mse': 0.15381544718035087, 'index_mse': 0.16636189201532087, 'matching_mse_features': [0.13739837269222452, 0.1395309409295018, 0.13677679887355126, 0.14408421162706211, 0.1430115910456261, 0.13726657544044085, 0.14274372684301717, 0.13504614539190338, 0.13853582796877975, 0.14482307626368343], 'harmonic_mean_features': [0.1309991815519093, 0.13157175020534279, 0.12735134531950718, 0.1327483317911355, 0.1336402851605765, 0.12878380179856022, 0.1344831997941457, 0.12782689483798823, 0.12909420446395195, 0.13417435670997752], 'covered_mse_features': [0.12516953618356524, 0.12447158260731798, 0.11914118322950448, 0.12306606276504639, 0.1254216201001874, 0.12128844181049621,
0.12712643943219143, 0.12134032531607968, 0.12085741660832867, 0.12498436126166071], 'index_mse_features': [0.16968036010688156, 0.1624888691672768, 0.15926142198600082, 0.17250634507748022, 0.16713668302081525, 0.16663213728264645, 0.1596766027744231, 0.16251306560725656, 0.17160303243460656, 0.17212040269582168], 'macro_matching_mse': 0.13992172670757905, 'macro_covered_mse': 0.12328669693143782, 'macro_harmonic_mean': 0.13106733516330948, 'macro_index_mse': 0.1663618920153209}
```

### Inputs

- **predictions** (list of list of list of float or numpy.ndarray): The generated time series. The shape of the array should be `(num_generation, seq_len, num_features)`.
- **references** (list of list of list of float or numpy.ndarray): The original time series. The shape of the array should be `(num_reference, seq_len, num_features)`.
- **batch_size** (int, optional): The batch size for computing the metric. Compute cost scales quadratically with this value. Default is `None`.

### Output Values

Let the prediction instances be $P = \{p_1, p_2, \ldots, p_n\}$ and the reference instances be $R = \{r_1, r_2, \ldots, r_m\}$.

- **matching_mse** (float): Average of the MSE between each generated instance and the reference instance with the lowest MSE. Intuitively, this is similar to precision in classification. As an equation: $\frac{1}{n} \sum_{i=1}^{n} \min_{j} \mathrm{MSE}(p_i, r_j)$.
- **covered_mse** (float): Average of the MSE between each reference instance and the generated instance with the lowest MSE. Intuitively, this is similar to recall in classification. As an equation: $\frac{1}{m} \sum_{j=1}^{m} \min_{i} \mathrm{MSE}(p_i, r_j)$.
- **harmonic_mean** (float): Harmonic mean of matching_mse and covered_mse. This is similar to the F1-score in classification.
- **index_mse** (float): Average of the MSE between the generated instance and the reference instance with the same index. As an equation: $\frac{1}{n} \sum_{i=1}^{n} \mathrm{MSE}(p_i, r_i)$.
- **matching_mse_features** (list of float): matching_mse computed individually for each feature.
- **covered_mse_features** (list of float): covered_mse computed individually for each feature.
- **harmonic_mean_features** (list of float): harmonic_mean computed individually for each feature.
- **index_mse_features** (list of float): index_mse computed individually for each feature.
- **macro_matching_mse** (float): Average of matching_mse_features.
- **macro_covered_mse** (float): Average of covered_mse_features.
- **macro_harmonic_mean** (float): Average of harmonic_mean_features.
- **macro_index_mse** (float): Average of index_mse_features.

#### Values from Popular Papers

### Examples

## Limitations and Bias

This metric assumes that the generated time series should match the original time series, which may not hold in every scenario. It may therefore be unsuitable for evaluating time-series generation models that are not required to match the original time series.

## Citation

## Further References
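The core quantities defined above can be sketched in plain NumPy. This is a simplified illustration, not the metric's actual implementation: the function name `matching_metrics` is hypothetical, instances are flattened over `(seq_len, num_features)` before the pairwise MSE, and no batching is performed.

```python
import numpy as np

def matching_metrics(predictions, references):
    """Illustrative sketch of matching_mse, covered_mse, harmonic_mean, index_mse.

    predictions: array of shape (num_generation, seq_len, num_features)
    references:  array of shape (num_reference, seq_len, num_features)
    """
    n, m = len(predictions), len(references)
    p = predictions.reshape(n, -1)
    r = references.reshape(m, -1)
    # Pairwise MSE between every generated and reference instance,
    # averaged over time steps and features; shape (n, m).
    mse = ((p[:, None, :] - r[None, :, :]) ** 2).mean(axis=-1)
    # Best reference per generated instance (precision-like).
    matching_mse = mse.min(axis=1).mean()
    # Best generated instance per reference (recall-like).
    covered_mse = mse.min(axis=0).mean()
    total = matching_mse + covered_mse
    harmonic_mean = 0.0 if total == 0 else 2 * matching_mse * covered_mse / total
    # index_mse pairs instances by index; here we truncate to min(n, m).
    k = min(n, m)
    index_mse = ((p[:k] - r[:k]) ** 2).mean()
    return matching_mse, covered_mse, harmonic_mean, index_mse
```

With two generated instances `[[0.0]]` and `[[1.0]]` against a single reference `[[0.0]]`, matching_mse averages the per-generation minima (0 and 1) to 0.5, while covered_mse is 0 because the reference is matched exactly.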