Natural Language Access to Big Data: Papers from the AAAI Fall Symposium
Dan G. Tecuci, Ulli Waltinger, Daniel Sonntag, Cochairs
November 15–17, 2013, Arlington, Virginia
Technical Report FS-14-06
Softcover version of the technical report: $30.00
(For international orders, please check shipping options before ordering on the website.)
Today's enterprises need to make decisions based on analyzing massive and heterogeneous data sources. More and more aspects of business are driven by data, and as a result more and more business users need access to data. Offering diverse business users easy access to the right data is of growing importance. Several challenges must be overcome to meet this goal. The first is sheer volume: enterprise data is predicted to grow by 800 percent in the next five years. The second is structure: the biggest part of this data (80 percent) is stored in unstructured documents, most of them missing informative metadata or semantic tags (beyond date, size, and author) that might help in accessing them. A third challenge comes from the need to offer access to this data to different types of users, most of whom are not familiar with the underlying syntax or semantics of the data.
Natural language interfaces and question answering systems, such as Watson, Siri, Start, or Evi, have been successfully implemented in various domains, including encyclopedic knowledge (for example, IBM's Jeopardy Challenge), energy (for example, DGRC), and mathematics (for example, Wolfram Alpha). Following prior work on natural language interfaces to databases (NLIDB) and question answering (QA) systems, the symposium plans to bring together people from both academia and industry to present their most recent work on problems that leverage natural language in the context of big data, share information on their latest investigations, and exchange ideas and thoughts. The aim is to push the research frontier toward new technologies that tackle natural language access to large-scale and heterogeneous data.