Automatic construction of Korean verbal type hierarchy using Treebank




The lexical information of verbal lexemes, such as verbs and adjectives, plays an important role in syntactic parsing, because the structure of a sentence hinges mainly on the type of its verbal lexeme. The question we address in this research is how to acquire the argument structure (henceforth ARG-ST) of verbal lexemes in Korean. It is well known that building a type hierarchy by hand usually costs too much time and too many resources, so an alternative method, namely the automatic collection of the relevant information, is strongly preferred. This paper proposes a procedure for automatically collecting the ARG-ST of Korean verbal lexemes from a Korean Treebank. Specifically, the system we develop first extracts the ARG-ST information of verbal lexemes from a Korean Treebank of 0.8 million graphic words in an unsupervised way, then checks the hierarchical relationships among the extracted structures, and finally builds the type hierarchy automatically. The result is written in an HPSG-style annotation, so that it can be readily implemented in an HPSG-based parser for Korean. Finally, the result is evaluated against two Korean dictionaries and against a manually constructed type hierarchy.
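The three steps named above (extract ARG-ST information, check hierarchical relationships, build the hierarchy) can be illustrated with a minimal sketch. It is not the system described in this paper: the toy observations, the case labels, the function names `collect_frames` and `build_hierarchy`, and the use of set inclusion as the hierarchical relation are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy observations standing in for Treebank clauses:
# (verbal stem, case labels of the arguments seen with it in one clause).
OBSERVATIONS = [
    ("ca-", ("NOM",)),
    ("mek-", ("NOM", "ACC")),
    ("mek-", ("NOM",)),
    ("cwu-", ("NOM", "DAT", "ACC")),
    ("cwu-", ("NOM", "ACC")),
]


def collect_frames(observations):
    """Group the observed argument frames by verbal lexeme."""
    frames = defaultdict(set)
    for lemma, args in observations:
        frames[lemma].add(frozenset(args))
    return frames


def build_hierarchy(frames):
    """Map each frame to its immediate supertypes.

    Assumption: frame G is a supertype of frame F when G is a proper
    subset of F; "immediate" means no other frame lies between them.
    """
    all_frames = set().union(*frames.values())
    parents = {}
    for frame in all_frames:
        smaller = [g for g in all_frames if g < frame]
        parents[frame] = {
            g for g in smaller if not any(g < h for h in smaller)
        }
    return parents


frames = collect_frames(OBSERVATIONS)
hierarchy = build_hierarchy(frames)
for frame, supertypes in hierarchy.items():
    print(sorted(frame), "<-", [sorted(s) for s in supertypes])
```

Under the subset assumption, the ditransitive frame is linked only to the transitive frame, and the transitive frame only to the intransitive one, so the output is a chain rather than a flat list of frames.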


