Scholarly Big Data: AI Perspectives, Challenges, and Ideas
Cornelia Caragea, C. Lee Giles, Narayan Bhamidipati, Doina Caragea, Sujatha Das Gollapalli, Saurabh Kataria, Huan Liu, Feng Xia, Organizers
Technical Report WS-15-13
Softcover version of the technical report: $25.00
(For international orders, please check shipping options on the website before ordering.)
Academics and researchers worldwide continue to produce large numbers of scholarly documents, including papers, books, and technical reports, as well as associated data such as tutorials, proposals, and course materials. For example, PubMed contains over 20 million documents, 10 million unique author names, and 70 million name mentions; Google Scholar is believed to index many millions more. Understanding at scale how research topics emerge, evolve, or disappear, what constitutes a good measure of the quality of published work, which areas of research are most promising, how authors connect to and influence one another, who the experts in a field are, and who funds a particular research topic are some of the major foci of the rapidly emerging field of Scholarly Big Data.
Digital libraries, repositories, databases, Wikipedia, funding agencies, and the web have become a medium for answering such questions. For example, citation analysis mines large publication graphs to extract patterns in the data (e.g., citations per article) that can help measure the quality of a journal. Scientometrics mines graphs that link together multiple types of entities (authors, publications, conference venues, journals, and institutions) to assess the quality of science and to answer complex questions such as those listed above. Recent developments in Artificial Intelligence make it possible to transform the way we analyze research publications, funded proposals, and patents on a web-wide scale.
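As a minimal illustration of the kind of citation analysis described above, the sketch below counts citations per article over a toy citation graph and derives a simple per-journal average; the paper IDs and journal membership are hypothetical, not drawn from any of the datasets mentioned.

```python
from collections import Counter

# Toy citation graph as (citing, cited) pairs; paper IDs are hypothetical.
citations = [
    ("p1", "p3"), ("p2", "p3"), ("p2", "p4"),
    ("p4", "p3"), ("p5", "p4"),
]

# Citations per article: count how often each paper appears as the cited end.
counts = Counter(cited for _, cited in citations)

# A simple journal-quality proxy: mean citations per article in the journal.
journal = {"p3", "p4"}  # hypothetical set of articles in one journal
mean_citations = sum(counts[p] for p in journal) / len(journal)
print(counts["p3"], mean_citations)  # p3 is cited 3 times; journal mean is 2.5
```

Real scientometric measures (e.g., impact factors) add time windows and normalization, but they rest on the same graph-counting primitive shown here.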