Using Clustering Methods for Discovering Event Structures

Cosmin A Bejan, Sanda Harabagiu

To understand specific situations, domain experts manually build event models that provide the infrastructure for reasoning about events and for simulating event scenarios. Such models are also encoded in document collections in which each document is represented as a sequence of events. When situations are described in text, events do not operate independently; rather, they are interrelated with other events from the same scenario. For example, in a crime scenario, a person is accused of a crime, then arrested and interrogated, after which a trial is held. We define an event structure as a collection of events that interact with each other in a specific situation. Extracting event structures from texts allows us to perform various forms of inference over events. For example, given an event e from an event structure, we can determine which events are likely to happen after e. Another example is computing the probability that an event e from an event structure s happens, given that a set of other events from s (not containing e) has already occurred in a text t. In this example, e is not required to be present in t. In general, if we know which events interact with each other in an event structure, we can build more complex inference models that deal with the causality, intentionality, or temporality of events. Our goal is to provide a method that automatically extracts event structures from texts. To build event structures, we need (1) to determine the set of events that belong to the same event structure and (2) to establish what relations exist between the events in the same structure. In this abstract, we describe the theoretical frameworks for solving both tasks, but we focus on the first task, for which we also report preliminary results. This work is in the same spirit as work on Topic Detection and Tracking tasks (Allan 2002). However, instead of treating clusters as topically related bags of words, as in a classic topic modeling approach, our goal is to build structured event representations and to interpret the event interactions that exist in these representations.
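
The conditional-probability inference mentioned above can be illustrated with a minimal sketch. It is not the authors' method: it assumes a simple count-based estimate over a collection in which each document is a list of event labels, it ignores event order, and the function name `conditional_event_probability` and the toy corpus are hypothetical, invented only for illustration.

```python
from typing import Iterable, Optional, Sequence, Set


def conditional_event_probability(
    documents: Sequence[Iterable[str]],
    target: str,
    observed: Set[str],
) -> Optional[float]:
    """Estimate P(target | observed) by counting documents.

    Returns None when no document contains every observed event.
    """
    if target in observed:
        raise ValueError("target must not be among the observed events")
    # Assumption: order is ignored, so each document becomes a set of events.
    event_sets = [set(doc) for doc in documents]
    # Keep only the documents in which all observed events occur.
    matching = [events for events in event_sets if observed <= events]
    if not matching:
        return None
    # Fraction of those documents that also contain the target event.
    hits = sum(1 for events in matching if target in events)
    return hits / len(matching)


# Toy corpus (invented), loosely following the crime scenario in the text.
corpus = [
    ["accuse", "arrest", "interrogate", "trial"],
    ["accuse", "arrest", "trial"],
    ["accuse", "arrest", "release"],
    ["accuse", "investigate"],
]

p_trial = conditional_event_probability(corpus, "trial", {"accuse", "arrest"})
```

Here three of the four toy documents contain both observed events, and two of those also contain a trial, so the estimate is 2/3. A fuller model would need smoothing and would take event order and relations into account.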

Subjects: 13. Natural Language Processing; 13.1 Discourse

Submitted: Apr 9, 2008

This page is copyrighted by AAAI. All rights reserved.