EncycNet: A Historical Encyclopaedic Information System
(Funded by DFG - Scientific Library Services and Information Systems)
CDS members associated with the project: Prof. Dr. Andreas Witt
The subject of the project is the development of techniques for integrating various encyclopaedias and lexicons, especially historical ones, into a comprehensive information system. The result of the project will be a "historical Wikipedia", organized on the basis of a diachronic concept graph and inspired by the functionality of semantic networks (e.g. WordNet). The aim is to analyse and integrate heterogeneous encyclopaedic works at the level of the concepts discussed in them, i.e. to trace different lemmas, terms and proper names back to their common entities and to divide the content of their glosses into different concepts and semantic classes. On the basis of the entities determined in this way, 1) a concept graph will be generated in its synchronic and diachronic dimensions, and 2) a hierarchy of entities and knowledge about their relations to each other will be established. The resulting ontology corresponds to an information system that allows detailed queries and inferences over the entire encyclopaedia collection. Each entity is to be assigned to its equivalent in Wikipedia/DBpedia or in an authority file (GND), or at least to one semantic class from the Wikipedia category system. Development is carried out on a collection of historical German-language texts, but the resulting indexing techniques and data models are in principle also suitable for building a multilingual system and are not limited to the encyclopaedia as a text type. Owing to the depth of the information extraction, the modelling of the temporal dimension and the induction of a formalised knowledge base directly from historical material, the encyclopaedia network is a novel resource capable of lastingly changing the work of research communities at the interface between the humanities and computer science.
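
To make the idea of a diachronic concept graph more concrete, the following is a minimal Python sketch of one possible data model. It is not the project's actual implementation: the class names (Attestation, Entity, ConceptGraph), the relation label, the source labels, the years and the external identifier value are all hypothetical and chosen only for illustration. The sketch shows how different lemmas from different works and periods could be traced back to one entity, how a time slice of the graph could be taken, and how links to external resources such as GND or DBpedia could be stored.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Attestation:
    """One lemma as it appears in one encyclopaedic work (hypothetical model)."""
    lemma: str
    source: str  # label of the work; invented for this sketch
    year: int    # publication year; invented for this sketch


@dataclass
class Entity:
    """A common entity to which several lemmas are traced back."""
    entity_id: str
    semantic_class: str
    attestations: List[Attestation] = field(default_factory=list)
    # Links to external resources, e.g. {"gnd": "..."}; values are placeholders.
    external_ids: Dict[str, str] = field(default_factory=dict)


class ConceptGraph:
    """Entities plus typed relations, queryable by lemma and by period."""

    def __init__(self) -> None:
        self.entities: Dict[str, Entity] = {}
        self.lemma_index: Dict[str, Set[str]] = defaultdict(set)
        self.relations: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.entity_id] = entity
        for att in entity.attestations:
            # Index lemmas case-insensitively so variants resolve to the entity.
            self.lemma_index[att.lemma.lower()].add(entity.entity_id)

    def add_relation(self, source_id: str, relation: str, target_id: str) -> None:
        self.relations[source_id].add((relation, target_id))

    def resolve(self, lemma: str) -> Set[str]:
        """Return the ids of all entities attested under this lemma."""
        return set(self.lemma_index.get(lemma.lower(), set()))

    def timeline(self, entity_id: str) -> List[Attestation]:
        """Diachronic view: all attestations of one entity, oldest first."""
        return sorted(self.entities[entity_id].attestations, key=lambda a: a.year)

    def synchronic_slice(self, start: int, end: int) -> Dict[str, List[Attestation]]:
        """Synchronic view: entities attested between start and end (inclusive)."""
        result: Dict[str, List[Attestation]] = {}
        for entity_id, entity in self.entities.items():
            hits = [a for a in entity.attestations if start <= a.year <= end]
            if hits:
                result[entity_id] = hits
        return result


# Invented example data: two spelling variants of one lemma in two works.
graph = ConceptGraph()
graph.add_entity(Entity(
    entity_id="e1",
    semantic_class="PhysicalPhenomenon",
    attestations=[
        Attestation("Elektricität", "Encyclopaedia A", 1780),
        Attestation("Elektrizität", "Encyclopaedia B", 1890),
    ],
    external_ids={"gnd": "PLACEHOLDER-ID"},
))
graph.add_entity(Entity(
    entity_id="e2",
    semantic_class="FieldOfStudy",
    attestations=[Attestation("Physik", "Encyclopaedia A", 1780)],
))
graph.add_relation("e1", "studied_in", "e2")

print(graph.resolve("Elektrizität"))
print([a.lemma for a in graph.timeline("e1")])
print(sorted(graph.synchronic_slice(1850, 1900)))
```

In this sketch the diachronic dimension is carried by the year on each attestation rather than by separate graphs per period, which keeps a single entity identity across spelling changes. A real system would presumably need richer temporal modelling, for example relations that are themselves valid only for certain periods.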