Data releases

Data release 1.0


@InProceedings{vossen-EtAl:2020:LREC,
author = {Vossen, Piek and Ilievski, Filip and Postma, Marten and Fokkens, Antske and Minnema, Gosse and Remijnse, Levi},
title = {Large-scale Cross-lingual Language Resources for Referencing and Framing},
booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},
month = {May},
year = {2020},
address = {Marseille, France},
publisher = {European Language Resources Association},
pages = {3162--3171},
url = {https://www.aclweb.org/anthology/2020.lrec-1.387}
}

Have you ever wondered how the same event is described in different languages? Then this dataset might be useful to you.
From Wikidata, we’ve selected 25 event types, such as military operation (see Section 6 of the paper for more information).
In total, we collected 19,979 Wikidata items that belong to these 25 event types.
For each Wikidata item, we attempted to retrieve the first paragraph of the Wikipedia page that describes it.
We included English, Italian, and Dutch texts, which we processed using various NLP systems.
We also include structured data about each Wikidata item, which facilitates research into the framing of events.
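
To make the collection steps above more concrete, here is a minimal sketch of how one might look up items of a given event type and locate the introductory text of their Wikipedia pages. It is not the pipeline used to build this release. The function names, the use of `wdt:P31/wdt:P279*` (instance of, including subclasses), and the choice of the MediaWiki `extracts` API are all assumptions made for illustration. Note that `exintro` returns the whole lead section, which may be longer than the first paragraph.

```python
from urllib.parse import urlencode

# Public Wikidata SPARQL endpoint; send the query string here to run it.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# Language codes for the three languages in this release.
LANGUAGES = ("en", "it", "nl")  # English, Italian, Dutch


def build_event_query(event_type_qid: str, language: str, limit: int = 100) -> str:
    """Build a SPARQL query for items of one event type that have a
    Wikipedia page in `language`. Hypothetical helper, not the authors' code."""
    if not (event_type_qid.startswith("Q") and event_type_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata QID: {event_type_qid!r}")
    return (
        "SELECT ?item ?article WHERE {\n"
        # Assumption: match direct instances and instances of subclasses.
        f"  ?item wdt:P31/wdt:P279* wd:{event_type_qid} .\n"
        # Keep only items that have a page on the requested Wikipedia.
        "  ?article schema:about ?item ;\n"
        f"           schema:isPartOf <https://{language}.wikipedia.org/> .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def build_intro_url(title: str, language: str) -> str:
    """Build a MediaWiki API URL that requests the plain-text lead section
    of one Wikipedia page. Hypothetical helper, not the authors' code."""
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": 1,       # only the text before the first section heading
        "explaintext": 1,   # plain text instead of HTML
        "titles": title,
        "format": "json",
    }
    return f"https://{language}.wikipedia.org/w/api.php?" + urlencode(params)
```

Both helpers only build strings, so they can be inspected without network access. Replace the QID with that of an event type you have looked up on Wikidata, and call the helpers once per entry in `LANGUAGES`.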

You can download it from here.