The reported results do not match what is written in the paper. The paper reports a micro F1 score.

A 70% micro F1 score on the REBEL dataset would also not be SOTA, since REBEL itself achieved a 74 micro F1 score (Section 5 in https://aclanthology.org/2021.findings-emnlp.204.pdf).

We evaluated the KnowGL model using the experimental setup described in the GenIE paper (Table 1 in https://aclanthology.org/2022.naacl-main.342.pdf), where the reported F1 is 68.93.
Details about the KnowGL evaluation are described in the paper https://arxiv.org/abs/2207.05188, which is already mentioned in the current README.md.

gaetangate changed pull request status to closed

@mingaflo The evaluation done by the REBEL authors is not the same as the one done by the GenIE authors and by us. Please look at Figure 1 in https://aclanthology.org/2021.findings-emnlp.204.pdf and compare it with Figure 1 in https://aclanthology.org/2022.naacl-main.342.pdf. The extracted triples do not contain the same kind of information.

Given the text "US president Biden was born in Pennsylvania", the REBEL system is expected to generate "<Biden, position held, US president>" or "<US, president, Biden>". Here "US" and "Biden" are entity mentions that already appear inside the text.
But the GenIE system (and our KnowGL system) is expected to generate "<Joe Biden, position held, President of the United States>" or "<United States of America, president, Joe Biden>". "United States of America" and "Joe Biden" are the corresponding Wikidata labels for those entities, and are not necessarily present in the text.
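To make the difference concrete, here is a minimal illustrative sketch (not code from either paper's evaluation pipeline): the same sentence has different gold triples under the two setups, so an exact-match micro-F1 computed against one gold standard is not comparable to the other.

```python
sentence = "US president Biden was born in Pennsylvania"

# REBEL-style gold triple: subject/object are surface mentions from the text.
rebel_triple = ("Biden", "position held", "US president")

# GenIE/KnowGL-style gold triple: subject/object are canonical Wikidata
# labels, which need not appear verbatim in the sentence.
genie_triple = ("Joe Biden", "position held", "President of the United States")

def exact_match(pred: tuple, gold: tuple) -> bool:
    """Exact-match comparison, as used (in spirit) by triple-level F1 scoring."""
    return pred == gold

# A mention-level prediction does not match the entity-linked gold triple,
# even though both describe the same underlying fact:
print(exact_match(rebel_triple, genie_triple))  # False
print(exact_match(genie_triple, genie_triple))  # True
```

This is why a score computed under the REBEL setup cannot be compared directly with one computed under the GenIE/KnowGL setup.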

Hope this clarifies any confusion.

Thanks for clarifying!
