Nathanael Erik Schweikhard deposited "Developing an annotation framework for word formation processes in comparative linguistics" on Humanities Commons 3 years, 6 months ago
Word formation plays a central role in human language, yet computational approaches to historical linguistics often pay little attention to it. As a result, the detailed findings of classical historical linguistics are often used only in qualitative studies and not in quantitative ones. Building on the human- and machine-readable formats suggested by the CLDF initiative, we propose a framework for annotating cross-linguistic etymological relations that distinguishes etymologies involving only regular sound change from those that also involve linear or non-linear processes of word formation. This paper introduces the approach by means of sample datasets and a small Python library that facilitates annotation.
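To make the distinction drawn in the abstract more concrete, the following is a minimal sketch of how such annotations might be represented and written to a CSV table, the kind of tabular format CLDF builds on. It is not the API of the Python library mentioned above: the names `Process`, `Relation`, and `to_csv`, the column headers, and the placeholder form identifiers are all assumptions made for illustration only.

```python
import csv
import io
from dataclasses import dataclass
from enum import Enum


class Process(Enum):
    """Hypothetical labels for the three kinds of relation named in the abstract."""
    SOUND_CHANGE = "regular_sound_change"
    LINEAR = "linear_word_formation"        # e.g. affixation or compounding
    NONLINEAR = "nonlinear_word_formation"  # e.g. stem-internal alternation


@dataclass
class Relation:
    """One annotated etymological link between two forms (IDs are placeholders)."""
    source_id: str
    target_id: str
    process: Process

    def involves_word_formation(self) -> bool:
        # Anything other than regular sound change counts as word formation here.
        return self.process is not Process.SOUND_CHANGE


def to_csv(relations):
    """Serialize relations to CSV text; the column names are assumed, not prescribed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source_ID", "Target_ID", "Process"])
    for relation in relations:
        writer.writerow([relation.source_id, relation.target_id, relation.process.value])
    return buffer.getvalue()


relations = [
    Relation("form-1", "form-2", Process.SOUND_CHANGE),
    Relation("form-2", "form-3", Process.LINEAR),
    Relation("form-2", "form-4", Process.NONLINEAR),
]

if __name__ == "__main__":
    print(to_csv(relations))
```

Keeping the process type as an explicit field is one way a quantitative study could filter out, or focus on, etymologies that go beyond regular sound change; the actual framework may encode this differently.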