---
title: matching_series
tags:
- evaluate
- metric
description: "Matching-based time-series generation metric"
sdk: gradio
sdk_version: 3.50
app_file: app.py
pinned: false
---

# Metric Card for matching_series

## Metric Description

Matching Series is a metric for evaluating time-series generation models. It is based on the idea of matching the generated time series with the original time series. The metric computes the Mean Squared Error (MSE) between matched pairs of generated and original instances. It outputs a score greater than or equal to 0, where 0 indicates a perfect generation.

## How to Use

At minimum, the metric requires the original time series and the generated time series as input. The metric can be used to evaluate the performance of time-series generation models.

```python
>>> import numpy as np
>>> import evaluate
>>> num_generation = 100
>>> num_reference = 10
>>> seq_len = 100
>>> num_features = 10
>>> references = np.random.rand(num_reference, seq_len, num_features)
>>> predictions = np.random.rand(num_generation, seq_len, num_features)
>>> metric = evaluate.load("bowdbeg/matching_series")
>>> results = metric.compute(references=references, predictions=predictions, batch_size=1000)
>>> print(results)
{'matching_mse': 0.15873331613053895, 'harmonic_mean': 0.15623569099681772, 'covered_mse': 0.15381544718035087, 'index_mse': 0.16636189201532087, 'matching_mse_features': [0.13739837269222452, 0.1395309409295018, 0.13677679887355126, 0.14408421162706211, 0.1430115910456261, 0.13726657544044085, 0.14274372684301717, 0.13504614539190338, 0.13853582796877975, 0.14482307626368343], 'harmonic_mean_features': [0.1309991815519093, 0.13157175020534279, 0.12735134531950718, 0.1327483317911355, 0.1336402851605765, 0.12878380179856022, 0.1344831997941457, 0.12782689483798823, 0.12909420446395195, 0.13417435670997752], 'covered_mse_features': [0.12516953618356524, 0.12447158260731798, 0.11914118322950448, 0.12306606276504639, 0.1254216201001874, 0.12128844181049621,
0.12712643943219143, 0.12134032531607968, 0.12085741660832867, 0.12498436126166071], 'index_mse_features': [0.16968036010688156, 0.1624888691672768, 0.15926142198600082, 0.17250634507748022, 0.16713668302081525, 0.16663213728264645, 0.1596766027744231, 0.16251306560725656, 0.17160303243460656, 0.17212040269582168], 'macro_matching_mse': 0.13992172670757905, 'macro_covered_mse': 0.12328669693143782, 'macro_harmonic_mean': 0.13106733516330948, 'macro_index_mse': 0.1663618920153209}
```

### Inputs

- **predictions** (list of list of list of float or numpy.ndarray): The generated time series. The shape of the array should be `(num_generation, seq_len, num_features)`.
- **references** (list of list of list of float or numpy.ndarray): The original time series. The shape of the array should be `(num_reference, seq_len, num_features)`.
- **batch_size** (int, optional): The batch size for computing the metric. Compute cost scales quadratically with this value. Default is `None`.

### Output Values

Let the prediction instances be $P = \{p_1, p_2, \ldots, p_n\}$ and the reference instances be $R = \{r_1, r_2, \ldots, r_m\}$.

- **matching_mse** (float): Average of the MSE between each generated instance and the reference instance with the lowest MSE. Intuitively, this is similar to precision in classification. As an equation: $\frac{1}{n} \sum_{i=1}^{n} \min_{j} \mathrm{MSE}(p_i, r_j)$.
- **covered_mse** (float): Average of the MSE between each reference instance and the generated instance with the lowest MSE. Intuitively, this is similar to recall in classification. As an equation: $\frac{1}{m} \sum_{j=1}^{m} \min_{i} \mathrm{MSE}(p_i, r_j)$.
- **harmonic_mean** (float): Harmonic mean of matching_mse and covered_mse. This is similar to the F1-score in classification.
- **index_mse** (float): Average of the MSE between the generated instance and the reference instance with the same index. As an equation: $\frac{1}{n} \sum_{i=1}^{n} \mathrm{MSE}(p_i, r_i)$.
- **matching_mse_features** (list of float): matching_mse computed individually for each feature.
- **covered_mse_features** (list of float): covered_mse computed individually for each feature.
- **harmonic_mean_features** (list of float): harmonic_mean computed individually for each feature.
- **index_mse_features** (list of float): index_mse computed individually for each feature.
- **macro_matching_mse** (float): Average of matching_mse_features.
- **macro_covered_mse** (float): Average of covered_mse_features.
- **macro_harmonic_mean** (float): Average of harmonic_mean_features.
- **macro_index_mse** (float): Average of index_mse_features.

#### Values from Popular Papers

### Examples

## Limitations and Bias

This metric assumes that the generated time series should match the original time series, which may not hold in every scenario. It may therefore be unsuitable for evaluating time-series generation models that are not required to match the original time series.

## Citation

## Further References
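The core quantities defined above can be sketched in plain NumPy. This is a simplified illustration, not the metric's actual implementation: the function name `matching_metrics` is hypothetical, instances are flattened over `(seq_len, num_features)` before the pairwise MSE, and no batching is performed.

```python
import numpy as np

def matching_metrics(predictions, references):
    """Illustrative sketch of matching_mse, covered_mse, harmonic_mean, index_mse.

    predictions: array of shape (num_generation, seq_len, num_features)
    references:  array of shape (num_reference, seq_len, num_features)
    """
    n, m = len(predictions), len(references)
    p = predictions.reshape(n, -1)
    r = references.reshape(m, -1)
    # Pairwise MSE between every generated and reference instance,
    # averaged over time steps and features; shape (n, m).
    mse = ((p[:, None, :] - r[None, :, :]) ** 2).mean(axis=-1)
    # Best reference per generated instance (precision-like).
    matching_mse = mse.min(axis=1).mean()
    # Best generated instance per reference (recall-like).
    covered_mse = mse.min(axis=0).mean()
    total = matching_mse + covered_mse
    harmonic_mean = 0.0 if total == 0 else 2 * matching_mse * covered_mse / total
    # index_mse pairs instances by index; here we truncate to min(n, m).
    k = min(n, m)
    index_mse = ((p[:k] - r[:k]) ** 2).mean()
    return matching_mse, covered_mse, harmonic_mean, index_mse
```

With two generated instances `[[0.0]]` and `[[1.0]]` against a single reference `[[0.0]]`, matching_mse averages the per-generation minima (0 and 1) to 0.5, while covered_mse is 0 because the reference is matched exactly.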