Leveraging "The Wisdom of the Crowds" for Efficient Tagging and Retrieval of documents from the Historic Newspaper Archive

    Project Director(s):
    Haimonti Dutta
    Author(s):
    William Chan, Haimonti Dutta, Megha Gupta, Manoj Pooleery
    Date:
    2013
    Group(s):
    Data Rescue
    Subject(s):
    Library and information science
    Item Type:
    White paper
    Institution:
    Columbia University
    Tag(s):
    NEH White papers, Digital Humanities Start-Up Grants, NEH Digital Humanities
    Permanent URL:
    http://dx.doi.org/10.17613/M6FD1P
    Abstract:
    Computers may have defeated humans in chess and arithmetic, but there are many areas where the human mind still excels, such as visual cognition and language processing (Comm. of ACM, Vol 52, No 3, March 2009). If one mind is good, it has been argued that several minds are likely to outperform individuals, and even experts, in certain tasks. This project aims to leverage the wisdom of the crowds (von Ahn, 2008) to collaboratively tag historical newspaper articles in the holdings of the New York Public Library (NYPL). Patrons and scholars will be encouraged to generate custom tags for the articles they read and use often; these tags will be integrated into a metadata library and evaluated for their contribution to retrieval performance. The text of the newspaper articles, along with the user-generated tags, will be subjected to statistical analysis and machine learning for automatic categorization.
    Notes:
    A study of user-generated subject tagging to improve search capabilities for large-scale digital archives of humanities materials, using the historic newspaper collections of the New York Public Library.
    Metadata:
    Status:
    Published
    Last Updated:
    2 years ago
    License:
    Attribution-NonCommercial

    Downloads

    Item Name: hd-51153-10.pdf
    Activity: Downloads: 112