Context-Sensitive Statistics for Improved Grammatical Language Models

Eugene Charniak, Glenn Carroll

We develop a language model using probabilistic context-free grammars (PCFGs) that is "pseudo context-sensitive" in that the probability that a non-terminal N expands using a rule T depends on N’s parent. We give the equations for estimating the necessary probabilities using a variant of the inside-outside algorithm. We present experimental results showing that, beginning with a high-performance PCFG, one can develop a pseudo PCSG that yields significant performance gains. Analysis shows that the benefits from the context-sensitive statistics are localized, suggesting that we can use them to extend the original PCFG. Further experiments confirm that this extension is feasible and that the resulting grammar retains the performance gains. This implies that our scheme may be useful as a novel method for PCFG induction.
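To make the idea of parent-conditioned rule probabilities concrete, the following is a minimal sketch, not the estimation procedure of the paper. The paper estimates these probabilities with a variant of the inside-outside algorithm; the sketch instead assumes a handful of already-parsed toy trees and uses plain relative-frequency counting. The tree encoding, the `TOP` root symbol, and the names `count_rules` and `rule_probs` are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Assumed toy encoding: a non-terminal is (label, [children]); a word is a str.

def count_rules(tree, parent, counts):
    """Count each expansion, keyed by (non-terminal, parent of the non-terminal)."""
    label, children = tree
    # The right-hand side is the sequence of child labels (or words).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(label, parent)][rhs] += 1
    for child in children:
        if not isinstance(child, str):
            count_rules(child, label, counts)

def rule_probs(counts):
    """Relative-frequency estimate of P(rule | non-terminal, parent)."""
    probs = {}
    for context, rules in counts.items():
        total = sum(rules.values())
        probs[context] = {rhs: n / total for rhs, n in rules.items()}
    return probs

# Two made-up parses: subject NPs are pronouns, object NPs vary.
toy_trees = [
    ("S", [("NP", [("PRO", ["she"])]),
           ("VP", [("V", ["saw"]),
                   ("NP", [("DET", ["the"]), ("N", ["dog"])])])]),
    ("S", [("NP", [("PRO", ["he"])]),
           ("VP", [("V", ["saw"]),
                   ("NP", [("PRO", ["her"])])])]),
]

counts = defaultdict(Counter)
for t in toy_trees:
    count_rules(t, "TOP", counts)  # "TOP" is an assumed dummy parent for the root
probs = rule_probs(counts)

# The same non-terminal NP gets different rule distributions under S and under VP.
print(probs[("NP", "S")])
print(probs[("NP", "VP")])
```

In this toy data, an NP whose parent is S always expands to a pronoun, while an NP whose parent is VP splits evenly between a pronoun and a determiner-noun sequence. An ordinary PCFG would pool both contexts into a single distribution for NP, which is the information the parent-sensitive statistics are meant to recover.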
