A Probabilistic Approach to Japanese Lexical Analysis

Virginia Teller

We report on a project to develop a stochastic lexical analyzer for Japanese and to compare its accuracy with that of conventional rule-based methods. In contrast to standard knowledge-intensive methods, the stochastic approach to lexical analysis relies on statistical techniques based on probabilistic models. This approach has not previously been applied to unrestricted Japanese text, and it promises to yield insights into word formation and other morphological processes in Japanese. In an experiment designed to assess its accuracy, a simple statistical technique for segmenting hiragana strings performed the task with a relatively low error rate.
