Score Generation Discrepancy in vectara/hallucination_evaluation_model

#12 opened by dhruv01

I'm encountering unexpected behavior with the vectara/hallucination_evaluation_model. When I input the same sentence as both premise and hypothesis, the model outputs a score of approximately 0.93 instead of the expected 1.0.

I'm curious about the underlying scoring mechanism and potential reasons for this discrepancy. Any insights into the model's scoring function or potential biases would be greatly appreciated.
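Here's a minimal reproduction of what I'm seeing. This sketch assumes the sentence-transformers CrossEncoder usage shown on the model card; the exact loading API may differ for newer versions of the model.

```python
from sentence_transformers import CrossEncoder

# Load the model as a cross-encoder over (premise, hypothesis) pairs.
model = CrossEncoder("vectara/hallucination_evaluation_model")

sentence = "A man walks into a bar and orders a drink."

# Identical premise and hypothesis: one might expect a score of 1.0,
# but the model returns roughly 0.93.
scores = model.predict([[sentence, sentence]])
print(scores)
```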

Here are some potential areas of exploration:

- How is the similarity between premise and hypothesis calculated?
- Are there any known limitations or biases in the model's scoring system?
- Could there be data-related issues affecting the score?

I'm looking forward to discussing this issue with the community and finding a solution.

Vectara org

Thanks for spotting this. It is generally difficult to explain why a Transformer-based model behaves a particular way for specific inputs.

One possible reason is that a large portion of the training data comes from summarization tasks, where the hypothesis is typically shorter and contains less information than the premise. In this sense, the score of 0.93 does not purely reflect the extent of hallucination; it also reflects summarization quality.
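If you want to probe this yourself, a quick sketch (again assuming the CrossEncoder usage from the model card; the premise and summary texts below are purely illustrative) is to compare the identical pair against the same premise paired with a shorter but faithful summary:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("vectara/hallucination_evaluation_model")

premise = ("The city council approved the new budget on Tuesday, "
           "allocating extra funds for road repairs and public parks.")

# Both hypotheses are consistent with the premise, so any difference in
# score would reflect length/summarization effects rather than hallucination.
pairs = [
    [premise, premise],                                    # identical text
    [premise, "The council approved the budget Tuesday."]  # shorter, faithful summary
]

for (p, h), score in zip(pairs, model.predict(pairs)):
    print(f"{score:.3f}  hypothesis: {h}")
```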
