• Creating a Meaningful Genre Schema and Metadata using IMDb data for a Large-Scale Digital Humanities Project in Media Studies

    Author(s):
    Cindy Conaway, Diane Shichtman
    Date:
    2020
    Group(s):
    DH2020
    Subject(s):
    Mass media--Study and teaching, Metadata, Television--Study and teaching
    Item Type:
    Presentation
    Meeting Title:
    DH2020
    Meeting Org.:
    Alliance of Digital Humanities Organizations (ADHO)
    Meeting Loc.:
    Virtual
    Meeting Date:
    July 20-25, 2020
    Tag(s):
    Descriptive metadata standards, Media studies, Television studies
    Permanent URL:
    http://dx.doi.org/10.17613/q2ra-m866
    Abstract:
    This long-term DH project examines the social networks of actors and crews across more than 32,500 media items from 1938 to 2017. Its primary source is the Internet Movie Database (IMDb), which is robust and provides free downloadable data but is also problematic (Conaway/Shichtman DH2018). “Genres can be approached from the point of view of the industry and its infrastructure . . . aesthetic traditions . . . broader socio-cultural environment . . . audience understanding and response” (Neale). IMDb uses genre terms inconsistently: what it calls “genres” actually combines traditional genres, subgenres, and target audiences, and it allows multiple selections per item. IMDb also relies heavily on users for its data and for much of its editing. “Although user editing allows a reference website such as IMDb to be up-to-date, it diffuses the responsibility for fact-checking, leading to greater uncertainty about accuracy and objectivity of information” (Wasserman). Other schemas use macro or idiosyncratic descriptors that allow an item to be included in multiple “lists.” The Library of Congress uses simply Comedy, Drama, Action, etc. AFI adds “Most Thrilling” (action, horror, adventure). Netflix’s “genres, based on a complicated algorithm that uses reams of data about users' viewing habits . . . number in the tens of thousands” (Telegraph), including “Family Watch Together TV.” It has taken significant additional research and reorganization to use the data effectively for statistical analysis. While most people can tell a western from science fiction, it is harder to deal with hybrid genres like dramedies or family movies, or genre combinations like science fiction western or action with romance. We therefore created a taxonomy with a variety of categories, including subjects, styles, settings, and audiences, with a concise definition for each category. If other scholars also use this schema, each media item can be described in a way that allows for effective and relatively consistent coding by multiple scholars.
    Metadata:
    Status:
    Published
    Last Updated:
    2 years ago
    License:
    All Rights Reserved

    Downloads

    Item Name: pptx dh2020-creating-a-meaningful-schema.pptx
    Activity: Downloads: 57