FarsWikiKG: an Automatically Constructed Knowledge Graph for Persian

Document Type: Original Article


Amirkabir University of Technology


We present FarsWikiKG, a Persian knowledge graph extracted from Wikipedia. In recent years, Wikipedia infoboxes have become a valuable resource for building knowledge graphs. FarsWikiKG contains more than 2 million entities and 5.7 million facts about them. Using Wikidata, we constructed an ontology with more than 6000 classes that represent entity types. As the second Persian knowledge graph with the ability to update itself, FarsWikiKG shows improvement on NLP tasks, especially question answering. Although FarsWikiKG is a dynamic knowledge graph, our evaluation shows a coverage of 90% of Persian Wikipedia pages. Because Wikipedia content changes constantly, a static knowledge graph can leave users with outdated data. In addition to solving this problem, the proposed system reduces the need for experts to extract and construct knowledge graphs manually. FarsWikiKG stores its information in RDF, a standard format for knowledge graphs, which allows NLP systems to run SPARQL queries on it.
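To make the infobox-to-RDF idea concrete, the following is a minimal sketch of how infobox attribute-value pairs might be turned into RDF facts in N-Triples syntax. It is not the authors' pipeline: the namespaces `RESOURCE_NS` and `PROPERTY_NS`, the function names, and the choice to emit every value as a Persian-tagged literal are all assumptions made for illustration.

```python
from typing import Dict, List
from urllib.parse import quote

# Hypothetical namespaces; the abstract does not state FarsWikiKG's actual IRIs.
RESOURCE_NS = "http://example.org/resource/"
PROPERTY_NS = "http://example.org/property/"


def iri(namespace: str, local_name: str) -> str:
    """Build an N-Triples IRI term, percent-encoding the local name."""
    cleaned = local_name.strip().replace(" ", "_")
    return "<" + namespace + quote(cleaned, safe="_") + ">"


def literal(value: str, lang: str = "fa") -> str:
    """Build a language-tagged N-Triples literal with basic escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"@{lang}'


def infobox_to_ntriples(entity: str, infobox: Dict[str, str]) -> List[str]:
    """Turn one infobox (attribute -> value) into one N-Triples line per fact."""
    subject = iri(RESOURCE_NS, entity)
    lines = []
    for attribute, value in infobox.items():
        if not value.strip():
            continue  # skip empty infobox fields
        predicate = iri(PROPERTY_NS, attribute)
        lines.append(f"{subject} {predicate} {literal(value.strip())} .")
    return lines


if __name__ == "__main__":
    # Illustrative input only; not data taken from FarsWikiKG.
    for line in infobox_to_ntriples("Tehran", {"country": "Iran"}):
        print(line)
```

Lines produced this way could be loaded into any RDF triple store and then queried with SPARQL. A real system would also need to link values to entities rather than plain literals and to map attributes onto ontology properties, which this sketch leaves out.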


Saeedeh Momtazi is currently an associate professor at Amirkabir University of Technology (AUT), Iran. She completed her BSc and MSc at Sharif University of Technology, Iran, and received a PhD in Artificial Intelligence from Saarland University, Germany. As part of her PhD, she was a visiting researcher at the Center for Language and Speech Processing at Johns Hopkins University, US. After completing her PhD, she worked as a postdoctoral researcher at the Hasso-Plattner Institute (HPI) at Potsdam University, Germany, and at the German Institute for International Educational Research (DIPF), Germany. Her main research focus is natural language processing.

Farhad Shirmardi received his BSc in Computer Science from Amirkabir University of Technology (AUT), Iran, in 2018. He received his MSc in Artificial Intelligence from Amirkabir University of Technology (AUT), Iran. His research interests are question answering and knowledge graphs.


Mohammad Hadi Hosseisni received his bachelor’s degree in computer engineering from Amirkabir University of Technology (AUT), Iran, in 2021. His research interests include machine learning, algorithms, and natural language processing.