Professor Manabu Okumura

Precision and Intelligence Laboratory
Advanced Information Processing Division: Okumura Group
Tokyo Institute of Technology

Section: Natural Language Processing, Automated Text Summarization, Computer Assisted Language Learning, Text Data Mining

Objective: Development of natural language processing techniques and systems that apply them

Current Topics:

Incremental Language Understanding Model (Robust Semantic and Discourse Processing)
Automated Text Summarization
Development of Communication Assistive Technology for People with Disabilities
Animation Control through Natural Language Understanding

Major

Natural Language Processing, Natural Language Understanding

Research Interest

Our research focuses on the development of a real-time, user-friendly speech dialogue system. We are currently working on three fundamental issues toward that goal: incremental interpretation, robustness improvement, and lexical acquisition.

1. Incremental interpretation model for Japanese: In a real-time dialogue system, text must be analyzed incrementally as it is entered. We propose an interpretation model that uses syntactic, semantic, contextual, and commonsense knowledge cooperatively. In particular, two systems are under development: one that performs syntactic and semantic analysis in an integrated way using the case frame information of verbs, and one that resolves word sense ambiguity and finds coherence between sentences simultaneously, based on the associativity between words.
2. Ill-formed input analysis: A user-friendly dialogue system should be robust and flexible enough to analyze user input with varied constructions and an unlimited vocabulary. It must therefore be able to cope with ill-formed sentences. Ellipsis is common in Japanese, so as a first step we present a method for filling the gaps left by various types of ellipsis.
3. Semi-automatic lexical acquisition: Large-scale dictionaries are indispensable for practical natural language understanding systems. However, because constructing a large lexicon by hand is considered difficult, extracting lexical information semi-automatically from text data such as large corpora and machine-readable dictionaries is an important challenge. Projects are currently in progress to extract associative relations between verbs and between nouns from machine-readable dictionaries of verbs and adjectives, respectively. Semi-automatic lexical acquisition from bilingual tagged corpora is also planned.
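
As a rough illustration of item 1, the sketch below shows how verb case frames could narrow the set of candidate verb senses as case-marked arguments arrive one at a time, before the sentence-final verb is seen. It is a toy example, not the group's model: the CASE_FRAMES table, the sense labels, the semantic classes, and the IncrementalInterpreter class are all hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass, field

# Hypothetical toy lexicon: verb sense -> {case particle: required semantic class}.
# The entries are invented for illustration only.
CASE_FRAMES = {
    "toru/take": {"ga": "human", "wo": "object"},
    "toru/photograph": {"ga": "human", "wo": "image"},
    "toru/catch": {"ga": "human", "wo": "animal"},
}


@dataclass
class IncrementalInterpreter:
    # Start with every sense as a candidate; each input can only shrink the set.
    candidates: set = field(default_factory=lambda: set(CASE_FRAMES))

    def feed(self, semantic_class, particle):
        """Consume one case-marked argument and drop incompatible senses."""
        self.candidates = {
            sense
            for sense in self.candidates
            if CASE_FRAMES[sense].get(particle) == semantic_class
        }
        return sorted(self.candidates)


interp = IncrementalInterpreter()
print(interp.feed("human", "ga"))  # all three senses remain
print(interp.feed("image", "wo"))  # only the "photograph" sense remains
```

Each call to feed filters the current candidates rather than re-analyzing the whole input, which is the sense in which the analysis is incremental. A realistic system would also need contextual and commonsense knowledge, which this sketch leaves out.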

Publications

"Towards Incremental Disambiguation with a Generalized Discrimination Network"; Proc. of the Eighth National Conference on Artificial Intelligence, pp. 990-995, 1990.
"Incremental Analysis of Japanese Dependency Relations with a Generalized Discrimination Network"; Proc. of the Second Pacific Rim International Conference on Artificial Intelligence, pp. 787-793, 1992.
"Towards Japanese Ellipsis Resolution with a Generalized Discrimination Network"; AI'92 Proc. of the 5th Australian Joint Conference on Artificial Intelligence (A. Adams and L. Stirling eds.), World Scientific, pp. 212-217, 1992.
"Word Sense Disambiguation and Text Segmentation Based on Lexical Cohesion"; Proc. of the 15th International Conference on Computational Linguistics, pp. 755-761, 1994.
"Word Sense Disambiguation by Marker Passing on Very Large Semantic Networks"; Proc. of the Natural Language Processing Pacific Rim Symposium'95, pp. 71-76, 1995.
"Zero Pronoun Resolution in Japanese Discourse Based on Centering Theory"; Proc. of the 16th International Conference on Computational Linguistics, pp. 871-876, 1996.

Okumura Lab. Homepage 1999-2002 Copyright Okumura Lab.
Please send any requests and comments to the following address:
webmaster[at mark]lr.pi.titech.ac.jp