Deposit: Lexomic Tools and Methods for Textual Analysis: Providing Deep Access to Digitized Texts

This project hybridizes traditional humanistic approaches to textual scholarship, such as source study and the analysis of style, with advanced computational and statistical comparative methods, allowing scholars “deep access” to digitized texts and textual corpora. Our multi-disciplinary collaboration enables us to discover patterns in (and between) texts previously invisible to traditional methods. Going forward, we will build on the success of our previous Digital Humanities Start-up Grant by further developing tools and documentation (in an open, on-line community) for applying advanced statistical methodologies to textual and literary problems. At the same time we will demonstrate the value of the approach by applying the tools and methods to texts from a variety of languages and time periods, including Old English, medieval Latin, and Modern English works from the twentieth-century Harlem Renaissance.

Deposit: Trading Consequences: A Case Study of Combining Text Mining and Visualization to Facilitate Document Exploration

Large-scale digitization efforts and the availability of computational methods, including text mining and information visualization, have enabled new approaches to historical research. However, we lack case studies of how these methods can be applied in practice and what their potential impact may be. Trading Consequences is an interdisciplinary research project involving environmental historians, computational linguists, and visualization specialists. It combines text mining and information visualization with traditional research methods in environmental history to explore commodity trade in the 19th century from a global perspective. Along with a unique data corpus, this project developed three visual interfaces to enable the exploration and analysis of four historical document collections, consisting of approximately 200,000 documents and 11 million pages related to commodity trading. In this article, we discuss the potential and limitations of our approach based on feedback we elicited from historians over the course of this project. Our findings, which can inform the design of such tools in the larger context of digital humanities projects, show that visualization-based interfaces are a valuable starting point for large-scale explorations in historical research. Besides providing multiple visual perspectives on the document collection to highlight general patterns, it is important to provide the context in which these patterns occur and to offer analytical tools for more in-depth investigations.

Deposit: Working with Text in a Digital Age

This Institute will provide 30 participants with three weeks in which (1) to develop hands-on experience with TEI-XML, (2) to apply methods from information retrieval, text visualization, and corpus and computational linguistics to the analysis of textual and linguistic sources in the Humanities, and (3) to rethink not only their own research agendas but also new relationships between their work and non-specialists (e.g., an expansion in opportunities for tangible contributions and significant research by undergraduates, new collaborations that transcend boundaries of language and culture, and increased opportunities for the general public to contribute to our understanding of the past). A two-day conference on the theme of the Institute will then follow in the summer of 2013 with an open call for contributions and will provide both a venue for and a challenge to the issues and ideas raised during the initial Institute and their importance for the digital humanities.

Deposit: Zur Materialität der historischen Quellen im Zeitalter der digitalen Edition (On the Materiality of Historical Sources in the Age of the Digital Edition)

Preprint, to be published in: Historische Editionen im digitalen Zeitalter. Les éditions historiques à l’ère numérique : Bestandesaufnahme und Ausblick. État des lieux et perspectives, ed. Pascale Sutter and Sacha Zala, Basel (Schwabe). The essay discusses the consequences of digital methods for the scholarly editing of historical sources. It comes to the following conclusions: Documents cannot be studied without taking their material features into account. Digital methods enable editors to document those features relevant to the critical analysis of the source. The physical text-bearing document is unique. It can never be reproduced, only referenced. Images, verbal descriptions, transcriptions, and even more sophisticated reproduction techniques are only selective. In the digital edition, the Internationalized Resource Identifier (IRI) of the semantic web is the best way to represent the original. Verbal descriptions have their own value alongside digital images and the analytical methods of materials science. Images are the cheapest way of editing, as they convey much information, although that information is not accessible to people lacking the necessary palaeographical skills. But computers can extract information from images too. Verbal description needs controlled vocabularies to create machine-readable versions of human-readable editions.

Member: Anna Kijas

Anna E. Kijas is Head of Lilly Music Library at Tufts University. Her academic training includes master’s degrees in library and information science from Simmons College and in music, with a concentration in musicology, from Tufts University, as well as a bachelor of arts in music literature and performance from Northeastern University. Anna is interested in the exploration and application of digital humanities tools and methods in historical (music) research, in the application of standards, including TEI and MEI, for open access research and publishing, and in the use of minimal computing. She also works on nineteenth-century music topics with a focus on gender, women, and performance criticism and reception. She recently published the book The Life and Music of Teresa Carreño (1853-1917): A Guide to Research, and has a digital project that documents Carreño’s performance career with primary source materials, metadata, and transcriptions, and explores her performances and texts through data analysis and visualization tools.

Member: Alberto Campagnolo

Alberto Campagnolo trained as a book conservator (in Spoleto, Italy) and has worked in that capacity in various institutions, e.g. London Metropolitan Archives, St. Catherine’s Monastery (Egypt), and the Vatican Library. He studied Conservation of Library Materials at Ca’ Foscari University Venice, and holds an MA in Digital Culture and Technology from King’s College London. He pursued a PhD on an automated visualization of historical bookbinding structures at the Ligatus Research Centre (University of the Arts, London). He was a CLIR Postdoctoral Fellow (2016-2018) in Data Curation for Medieval Studies at the Library of Congress (Washington, DC). Alberto, in collaboration with Dot Porter (SIMS, UPenn Libraries, Philadelphia, PA), has been involved from the outset in the development of VisColl, a model and tool for the recording and visualization of the gathering structure of books in codex format. Alberto has served on the Digital Medievalist board since 2014, first as Deputy Director, and as Director since 2015, and has been on the Editorial Board of the Journal of Paper Conservation since 2016.

Member: Molly Des Jardin

Molly is the Japanese Studies Librarian and liaison for Korean Studies at University of Pennsylvania Libraries, and Adjunct Assistant Professor in Penn’s East Asian Languages & Civilizations department. In addition to her work as a librarian, she taught the seminar East Asian Digital Humanities (EALC111/511) (living work-in-progress syllabus PDF) at Penn in Spring 2018. In 2014, along with Katie Rawson, Molly co-founded WORD LAB, the Penn Libraries text analysis learning community, still going strong after many years. Molly is a historian of the book in modern Japan, ranging from Meiji (1868-1912) publishing to 21st-century urban exploration publications, and has a particular focus on theories and practices of authorship. Her article “Inventing Saikaku: Collectors, Provenance, and the Social Creation of an Author” appeared in Book History v.20 (2017) and she has co-authored two book chapters with Michael P. Williams (in ACRL’s 2019 The Globalized Library and an upcoming ACTLS monograph on graphic novels in libraries).

Member: Narayanamoorthy Nanditha

I am a second-year Ph.D. student in the Department of Humanities at York University, Toronto. My research in Digital Humanities interrogates post-humanistic identity construction for online collectivities in Digital Activism in India, using Web APIs for Big Data extraction. My other project is a computational analysis of Genocide literature that explores trauma and memory structures within these narratives through sentiment analysis. I am a member of the Canadian Society for Digital Humanities (CSDH/SCHN) and the York Centre for Asian Research (YCAR). My intention is to contribute innovatively to Digital Humanities scholarship. Feel free to get in touch with ideas for collaboration on DH projects! My email address is nanditha [at] yorku [dot] ca.