pdf:docinfo:title: CIDOC2VEC: Extracting Information from Atomized CIDOC-CRM Humanities Knowledge Graphs
Keywords: data extraction; knowledge graph; CIDOC-CRM; digital humanities; Sphaera; recommender system
date: 2021-12-06T05:55:56Z
subject: The development of the field of digital humanities in recent years has led to the increased use of knowledge graphs within the community. Many digital humanities projects model their data on the CIDOC-CRM ontology, which offers a wide array of classes appropriate for storing humanities and cultural heritage data. The CIDOC-CRM ontology model leads to a knowledge graph structure in which entities are often linked to each other through chains of relations, so relevant information often lies many hops away from the entity it describes. In this paper, we present a method based on graph walks and text processing to extract entity information and provide semantically relevant embeddings. In the process, we were able to generate similarity recommendations as well as explore the underlying data structure. This approach was then demonstrated on the Sphaera Dataset, which is modeled according to the CIDOC-CRM data structure.
dc:creator: Hassan El-Hajj and Matteo Valleriani
dcterms:created: 2021-12-06T05:39:51Z
dcterms:modified: 2021-12-06T05:55:56Z
dc:format: application/pdf; version=1.7
xmpTPg:NPages: 18
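The abstract describes extracting entity information through graph walks followed by text processing. As a rough illustration of that idea only, the sketch below walks a toy graph of triples, flattens each entity's walks into a token list, and compares entities with a plain bag-of-words cosine similarity. It is not the authors' implementation: the triples, the entity identifiers, the helper names (`build_adjacency`, `walks_from`, `entity_document`, `cosine`), and the choice of bag-of-words similarity in place of learned embeddings are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt

# Toy (subject, predicate, object) triples. All identifiers are illustrative
# and do not come from the Sphaera dataset.
TRIPLES = [
    ("book_1", "P108i_was_produced_by", "production_1"),
    ("production_1", "P14_carried_out_by", "printer_A"),
    ("production_1", "P7_took_place_at", "Venice"),
    ("book_2", "P108i_was_produced_by", "production_2"),
    ("production_2", "P14_carried_out_by", "printer_A"),
    ("production_2", "P7_took_place_at", "Paris"),
    ("book_3", "P108i_was_produced_by", "production_3"),
    ("production_3", "P14_carried_out_by", "printer_B"),
    ("production_3", "P7_took_place_at", "Wittenberg"),
]


def build_adjacency(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    adjacency = {}
    for subject, predicate, obj in triples:
        adjacency.setdefault(subject, []).append((predicate, obj))
    return adjacency


def walks_from(entity, adjacency, max_hops):
    """Return every outgoing path that ends at a leaf or after max_hops hops."""
    finished = []
    stack = [(entity, [], 0)]
    while stack:
        node, path, hops = stack.pop()
        edges = adjacency.get(node, [])
        if not edges or hops == max_hops:
            if path:
                finished.append(path)
            continue
        for predicate, obj in edges:
            stack.append((obj, path + [predicate, obj], hops + 1))
    return finished


def entity_document(entity, adjacency, max_hops=3):
    """Flatten all walks from an entity into one list of tokens."""
    tokens = []
    for path in walks_from(entity, adjacency, max_hops):
        tokens.extend(path)
    return tokens


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two token lists, using raw term counts."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a)
    norm_a = sqrt(sum(c * c for c in counts_a.values()))
    norm_b = sqrt(sum(c * c for c in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


adjacency = build_adjacency(TRIPLES)
docs = {name: entity_document(name, adjacency) for name in ("book_1", "book_2", "book_3")}
print(cosine(docs["book_1"], docs["book_2"]))  # shares a printer with book_1
print(cosine(docs["book_1"], docs["book_3"]))  # shares only the property chain
```

In this toy graph the printer sits two hops away from each book, so a one-hop lookup would miss it; the bounded walk is what brings it into the entity's token list. The `max_hops` limit also keeps the traversal finite if the graph contains cycles.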