This research aims to develop and use a knowledge graph for Dutch maritime data. Such data can currently be scattered over several datasets, use different semantics for the same variables, and have limited availability. Creating a linked data knowledge graph does not by itself solve these problems: the linked data still has to be accessible and usable by domain researchers.
The Vereenigde Oostindische Compagnie (VOC) was a large trading company founded in 1602 in the Netherlands. The VOC created and curated several written logbooks. These logbooks recorded, for instance, the sailors who ventured on the journey to the East Indies and who might never return [1]. They also recorded the cargo shipped on these voyages: some of it rather innocent, such as spices, and some less so [2]. The logbooks were eventually converted to a digital format through the efforts of archives and research institutes. By diving into these datasets, researchers can compose ‘stories’ about the people who lived then, or connect the dots and discover a larger coherent theory. Publishing and managing these datasets in an open and structured way increases the feasibility and value of research projects. Sharing data in standard formats such as XML and CSV is possible but comes with downsides. One is that the interpretation of datasets and files is not always straightforward: variables and values can be interpreted in different ways.
A knowledge graph can provide a solution to these problems. It can drastically reduce the time a researcher has to spend browsing and preparing different datasets. Instead, a single query can span multiple datasets linked in the knowledge graph and return an answer quickly, even for complex questions. An additional benefit of a knowledge graph is its common vocabulary, in which concepts, classes, and properties are well defined and structured. Sharing information in a knowledge graph is also easier for machines: with the standardised SPARQL query language, multiple graphs can be queried at once.
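The role of a common vocabulary can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration: the column names, the `ex:` predicates, and the sample rows are invented and are not taken from the project's datasets or from its adjusted CIDOC CRM vocabulary. Two sources that label ships differently are mapped onto the same subject-predicate-object triples, after which one question can be answered across both.

```python
Triple = tuple[str, str, str]

# Hypothetical shared vocabulary terms (not the project's actual vocabulary).
HAS_CREW = "ex:hasCrewMember"
CARRIED = "ex:carriedCargo"

# Invented sample rows: the two sources use different column names for a ship.
crew_rows = [
    {"schip": "ShipA", "opvarende": "Sailor1"},
    {"schip": "ShipB", "opvarende": "Sailor2"},
]
cargo_rows = [
    {"ship_name": "ShipA", "product": "pepper"},
    {"ship_name": "ShipB", "product": "nutmeg"},
]


def to_triples() -> set[Triple]:
    """Map both sources onto one vocabulary so ships line up across datasets."""
    triples: set[Triple] = set()
    for row in crew_rows:
        triples.add((f"ex:{row['schip']}", HAS_CREW, row["opvarende"]))
    for row in cargo_rows:
        triples.add((f"ex:{row['ship_name']}", CARRIED, row["product"]))
    return triples


def match(triples, s=None, p=None, o=None) -> list[Triple]:
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def crew_on_ships_carrying(triples, product: str) -> list[str]:
    """Join the crew data and the cargo data on the shared ship identifier."""
    ships = {s for s, _, _ in match(triples, p=CARRIED, o=product)}
    return sorted(o for s, _, o in match(triples, p=HAS_CREW) if s in ships)


if __name__ == "__main__":
    graph = to_triples()
    print(crew_on_ships_carrying(graph, "pepper"))
```

A real knowledge graph would use IRIs and a SPARQL engine rather than Python tuples, but the join on a shared identifier is the same idea.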
The research paper and the knowledge graph are validated against competency questions posed by domain experts: the utility of the knowledge graph is demonstrated if these questions can be answered by querying the graph. A domain expert provided the first draft of competency questions:
- Which VOC Chamber was accountable for the highest number of slaves transported?
- How were the shipping routes in Asia divided between the various VOC chambers (for example, did ships from a specific chamber only sail on certain routes)?
- What was the average value of cargo on VOC return voyages per crew member/ship's ton, and how did this evolve over time?
- To what extent was the value per ship's ton on return voyages correlated with the skipper's track record (how much experience, i.e., how many previous voyages / how quickly did a ship get to Asia on an outbound voyage)?
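As an illustration of how a competency question turns into a query, the sketch below treats the first question in the list as a group-and-sum over triples. It rests on assumptions throughout: the predicate names `ex:chamber` and `ex:enslavedTransported`, the chamber labels, and all numbers are invented placeholders, not values from the archives. The SPARQL string shows how the same question might be phrased against a graph using those hypothetical predicates; it is not executed here.

```python
from collections import defaultdict

# Invented toy triples; predicates, chamber labels, and counts are hypothetical.
TRIPLES = [
    ("ex:voyage1", "ex:chamber", "ChamberA"),
    ("ex:voyage1", "ex:enslavedTransported", 12),
    ("ex:voyage2", "ex:chamber", "ChamberB"),
    ("ex:voyage2", "ex:enslavedTransported", 30),
    ("ex:voyage3", "ex:chamber", "ChamberA"),
    ("ex:voyage3", "ex:enslavedTransported", 25),
]

# How the question might look in SPARQL (assumed vocabulary, shown for reference).
SPARQL_EQUIVALENT = """
PREFIX ex: <http://example.org/voc/>
SELECT ?chamber (SUM(?n) AS ?total)
WHERE {
  ?voyage ex:chamber ?chamber ;
          ex:enslavedTransported ?n .
}
GROUP BY ?chamber
ORDER BY DESC(?total)
LIMIT 1
"""


def totals_per_chamber(triples) -> dict[str, int]:
    """Sum the transported count of every voyage under its chamber."""
    chamber_of = {s: o for s, p, o in triples if p == "ex:chamber"}
    totals: dict[str, int] = defaultdict(int)
    for s, p, o in triples:
        if p == "ex:enslavedTransported" and s in chamber_of:
            totals[chamber_of[s]] += o
    return dict(totals)


def chamber_with_highest_total(triples) -> tuple[str, int]:
    """Return the (chamber, total) pair with the largest total."""
    totals = totals_per_chamber(triples)
    return max(totals.items(), key=lambda item: item[1])


if __name__ == "__main__":
    print(chamber_with_highest_total(TRIPLES))
```

The remaining questions follow the same pattern with different joins and aggregates, for example an average of cargo value grouped by year for the third question.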
Broader questions about the development of the knowledge graph itself can also be addressed, such as: “What is a sustainable method of developing a knowledge graph related to Dutch maritime data?” and “How can a useful knowledge graph be developed for Dutch maritime history?”.
The Dutch maritime data was provided, curated, and described by the researchers of the Huygens ING and by the Dutch National Archives. The common vocabulary used thus far is an adjusted version of the CIDOC CRM.
[1] https://www.nationaalarchief.nl/onderzoeken/zoekhulpen/voc-opvarenden
[2] http://resources.huygens.knaw.nl/boekhoudergeneraalbatavia