Abstract:
We present a system for extracting structured information from unstructured text using a combination of information retrieval, natural language processing, machine learning, and crowdsourcing. We test our pipeline by building a structured database of gun violence incidents in the United States. The results of our pilot study demonstrate that the proposed methodology is a viable way of collecting large-scale, up-to-date data for public health, public policy, and social science research.

Published Date: 2015-11-12
Registration: ISBN 978-1-57735-740-7
DOI: 10.1609/hcomp.v3i1.13253