pedagogy, psychoanalysis, semantic webs, data mining, novels & tibetan terriers
The semantic web group
Historical accounting documents are a genre of texts that have considerable research potential if we treat them as humanities sources. MEDEA is a cooperative international project whose principal investigators recommend creating digital scholarly editions of accounts as a first step in a process that will open the information contained within them to the affordances of the Semantic Web. MEDEA researchers are at work on a bookkeeping ontology that can be used to intermediate between XML markup and exposing Linked Open Data as RDF. The information contained in the texts of accounts can then be used to explore humanities questions at levels from the granular or local to the regional or global. This paper reflects presentations from a multi-speaker session at DH2016 in which MEDEA participants discussed the kinds of humanities information found in accounts, the forms of electronic representation available for working with them, and an evolving bookkeeping ontology based on CIDOC-CRM.
This article considers the use of semantic web technologies in the context of everyday historians. It deduces from theoretical considerations the needs for the actual implementation of a digital edition. It explains some of the basic concepts of the semantic web more extensively than necessary for the digital humanities scholar already familiar with these technologies. I’ve argued elsewhere why a digital edition can be considered the best method to publish economic records as historical sources. The article first discusses the drawbacks of reducing digital editions of accounts and economic records to the encoding offered by the TEI. I will compare the text-oriented approach of the TEI with other digital representations of accounts that are oriented primarily towards the economic facts accounted. The second part of the article discusses the opportunities offered by the use of semantic web technologies (RDF, RDFS/OWL, SKOS and SPARQL) to encode and expose the content layer of digital editions. I have described elsewhere in more detail my own proposal for transforming a customized XML/TEI transcription into an XML serialisation of RDF facts, and there are other projects interlacing RDF structures into TEI. This article focuses on an introduction to the semantic web technologies as proposed by the W3C and discusses how they can be applied to historical accounts as a common data model, for the creation of controlled vocabularies, in exposing the content layer over the web, and for querying data aggregated from several sources. The final part of the article exemplifies the whole set of methods on data extracted from existing digital editions of late medieval accounts. The work presented in this paper is part of the MEDEA activities funded by DFG and NEH.
Named entity recognition for novel domains can be challenging in the absence of suitable training materials for machine learning or lexicons and gazetteers for term look-up. We describe an approach that starts from a small, manually created word list of commodities traded in the nineteenth century, and then uses semantic web techniques to augment the list by an order of magnitude, drawing on data stored in DBpedia. This work was conducted during the Trading Consequences project on text mining and visualisation of historical documents for the study of global trading in the British Empire.
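The abstract does not say which DBpedia relations were used to grow the word list. One plausible reading is a bounded traversal from each seed term through "narrower term" links, which can be sketched as follows. The `NARROWER` table is a hand-made, hypothetical stand-in for a concept hierarchy and is not taken from DBpedia; `expand_lexicon` and `max_depth` are likewise illustrative names.

```python
from collections import deque

# Hypothetical stand-in for a concept hierarchy; these links are invented
# for illustration and are NOT data retrieved from DBpedia.
NARROWER = {
    "spice": ["pepper", "cinnamon", "nutmeg"],
    "pepper": ["black pepper", "white pepper"],
    "textile": ["cotton", "silk"],
}


def expand_lexicon(seeds, narrower, max_depth=2):
    """Breadth-first expansion of seed terms through narrower-term links.

    Terms reached at max_depth are kept but not expanded further, which
    limits drift away from the original seed concepts.
    """
    found = set(seeds)
    queue = deque((term, 0) for term in seeds)
    while queue:
        term, depth = queue.popleft()
        if depth >= max_depth:
            continue
        for child in narrower.get(term, []):
            if child not in found:
                found.add(child)
                queue.append((child, depth + 1))
    return found


lexicon = expand_lexicon(["spice", "textile"], NARROWER)
shallow = expand_lexicon(["spice", "textile"], NARROWER, max_depth=1)
```

In this toy setting the two seeds grow into a larger lexicon, and lowering `max_depth` trades coverage for precision. A real pipeline would replace the dictionary lookup with queries against the knowledge base and would probably need manual filtering of the results.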
As part of Web 2.0 (Semantic Web), there is a new technology called FOAF (Friend of a Friend), describing relationships between people. We will investigate the applicability of FOAF for describing relationships between musicians of the past, thereby establishing a new biographical tool. Musicians have complex relationships, particularly those between teachers and students and those within ensembles of various sizes. Visual artists may have similar teacher-student relationships, but typically do not create their work together. Dancers may perform together, but they are usually taught in groups. Similarly, athletes may compete in groups, but they do not usually perform in public with their coaches. For this project we will focus specifically on relationships among Renaissance musicians and how to extract the biographical and relational data automatically from existing documents using natural language processing technology, creating a model applicable to other time periods and disciplines.
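To make the idea concrete, here is a minimal sketch of how a teacher-student relationship might be written down with FOAF, serialised as N-Triples. `foaf:Person`, `foaf:name` and `foaf:knows` are terms from the FOAF vocabulary. Everything under the `http://example.org/` namespace, including the `teacherOf` property and the two placeholder people, is an assumption made for illustration and is not part of the project described above.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
# Hypothetical namespace and property: FOAF itself only offers the generic
# foaf:knows, so a teacher-student link needs an extension vocabulary.
EX = "http://example.org/musicians/"
TEACHER_OF = "http://example.org/vocab/teacherOf"


def person(slug, name):
    """Return N-Triples lines declaring a foaf:Person with a name."""
    subject = f"<{EX}{slug}>"
    return [
        f"{subject} <{RDF_TYPE}> <{FOAF}Person> .",
        f'{subject} <{FOAF}name> "{name}" .',
    ]


def relate(subject_slug, predicate, object_slug):
    """Return one N-Triples line linking two people."""
    return f"<{EX}{subject_slug}> <{predicate}> <{EX}{object_slug}> ."


# Placeholder people, not historical musicians.
triples = (
    person("teacher-a", "Teacher A")
    + person("student-b", "Student B")
    + [
        relate("teacher-a", FOAF + "knows", "student-b"),
        relate("teacher-a", TEACHER_OF, "student-b"),
    ]
)
document = "\n".join(triples)
```

The generic `foaf:knows` triple keeps the data usable by any FOAF-aware tool, while the more specific property carries the direction of the relationship that matters for biography.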
Hi all, today I wanted to invite comments about a blog post of mine, but ended up being unsure of where to put this. At first I thought the post was a Doc (it’s a rather long and systematic blogpost) and tried to submit it to CORE, sharing it with the relevant Groups, but it […]
In 1999, the Electronic Literature Organization developed a comprehensive directory of electronic literature that has guided readers to thousands of works of electronic literature and helped to develop an international humanities discipline. But as the nature and complexion of the field have changed and matured, the directory has become both technologically and conceptually outdated. A decade after the release of the first incarnation of the directory, the authors and scholars at the Electronic Literature Organization will rebuild the Electronic Literature Directory using an open source, collaborative knowledge management platform and Semantic Web-based tools. The completely reconstructed directory will make records of works of electronic literature more accessible to the public, a team of editors will develop a metatag vocabulary and revise descriptions of listed works, and the finished product will show works in the context of critical scholarship about electronic literature.
The MarineLives project uses a Semantic MediaWiki as its platform to view and transcribe manuscript images, to annotate transcribed pages, and to create semantic biographies. All semantic biographies contain geographical location data, for example those for the hamlet of Limehouse in the parish of Stepney in the county of Middlesex. RDF data are available for […]
The MarineLives project uses a Semantic MediaWiki as its platform to view and transcribe manuscript images, to annotate transcribed pages, and to create semantic biographies. RDF data are available for download and reuse by other researchers. All data are downloadable under a CC BY 3.0 licence. The MarineLives wiki is built on a PHP-based […]