• John Kausch deposited Denotation Ambiguity Scoring for Panlingual Lexical Translation Inference in the group Linguistics on Humanities Commons 8 months, 1 week ago

    PanLex is a massive database of interlinked lemmas in over two thousand language varieties. Among the uses for such a resource is translation inference, in which novel translations are inferred in order to construct large ontologies and potentially derive statistically attested semantic universals. Research in this area has long relied on explicit lexicographic demarcation of multiple word senses to infer novel translations, a design feature that is impossible here and perhaps undesirable. This work proposes a new method for measuring the cost of translation as a function of ambiguity, potentially reimagining the structure of PanLex and opening the door to its use in probabilistic inference tasks that search for novel translations. The method for measuring ambiguity and ranking attested translations is tested against the intuitions of human translators in two language varieties, English and Polish. Ultimately, implicit methods of ambiguity ranking are found to be insufficient for sorting lexical entries, with no real correlation between the scoring function and the intuitions of respondents. However, over longer translation chains, an implicit ambiguity cost metric may still have merit. These results are then discussed in terms of potential confounds, and the pragmatic issues of conceiving of translation as a path search problem over a graph of linked