Automatic Extraction of Poetry from Digitally Scanned Books
- Author(s):
- John Foley
- Date:
- 2020
- Group(s):
- DH2020
- Subject(s):
- Digital humanities, Digital libraries, Natural language processing (Computer science), Data sets, Open access publishing, Poetry
- Item Type:
- Other
- Tag(s):
- Natural language processing, Open data
- Permanent URL:
- http://dx.doi.org/10.17613/zmyr-0857
- Abstract:
- We present an automatic, learned model for extracting poetry from digitally scanned books. This poster highlights our recent work on identifying poetry in Internet Archive books and the public resources (code, data, and models) that resulted from it. We hope this is the beginning of deeper and richer research into poetry in the digital humanities, because these resources should make curating custom collections of poetry less expensive. Additional information about our approach can be found at the home of our dataset: https://poetry.jjfoley.me.
- Notes:
- Accepted as a Poster to DH2020.
- Status:
- Published
- Last Updated:
- 3 years ago
- License:
- Attribution-NonCommercial-ShareAlike