Study of cultural relations of European countries through Wikipedia-based and real-world data

Project description: 
  • Data basis: In previous work we inferred cultural relations between language communities by mining similarities in the content and view statistics of articles (e.g. about food or music) in different Wikipedia language editions. See here and here.
  • Dataset: The dataset can be found here.
  • Goal: Propose an approach to validate the results of this study. For example, validity could be assessed by correlating the results with external data such as tourism statistics, since it is plausible that cultural understanding is higher when more people from country A visit country B. Explore these and similar correlations, and critically discuss how validity can be established for complex measurements such as cultural understanding or interest.
  • Method: Analyse the relationship between external data and the Wikipedia-based relationships inferred between language communities; propose alternative methods for ensuring the quality of web-based measurements of cultural similarity, interest and understanding between different language communities.
  • Team: At least half of the team members should have at least intermediate programming skills.
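
As a starting point for the correlation analysis described under Goal and Method, the following is a minimal sketch of a Spearman rank correlation between a Wikipedia-based similarity score and an external indicator for the same country pairs. Everything concrete in it is an assumption for illustration only: the names wiki_similarity and tourist_arrivals, the country pairs, and all numbers are made-up placeholders, not values from the dataset. Rank correlation is used here because the two measures are on very different scales; other measures may turn out to be more appropriate.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# PLACEHOLDER data (hypothetical, for illustration only): one score per
# ordered country pair (A, B), e.g. similarity of A's and B's language editions
# and the number of visitors from A to B.
wiki_similarity = {("de", "fr"): 0.62, ("de", "it"): 0.48,
                   ("fr", "it"): 0.55, ("it", "pl"): 0.21}
tourist_arrivals = {("de", "fr"): 1200, ("de", "it"): 900,
                    ("fr", "it"): 1500, ("it", "pl"): 100}

# Only pairs present in both sources can be compared.
pairs = sorted(set(wiki_similarity) & set(tourist_arrivals))
rho = spearman([wiki_similarity[p] for p in pairs],
               [tourist_arrivals[p] for p in pairs])
print(f"Spearman rho over {len(pairs)} pairs: {rho:.3f}")
```

With real data, a library routine such as scipy.stats.spearmanr would also report a p-value. Note that country pairs are not independent observations (each country appears in many pairs), so significance values from a standard test should be interpreted with caution.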