My primary research interest is the representation of semantic information and its use in natural language processing applications: computational lexical semantics. The difficulty of hand-crafting adequate semantic representations has limited the field of natural language processing to applications that can be contained within well-defined subdomains. The only escape from this limitation will be through automated or semi-automated methods of lexical acquisition, which presuppose a link between a distributional analysis of language and a well-founded theory of semantic representation.
I began with a thorough study of the usefulness of Lexical Conceptual Structures [Jackendoff72, 90] as a basis for computational lexical semantics, [Palmer81, Palmer83, Palmer90a]. The outcome of this study was embodied in the Pundit/Kernel text processing system, where the semantic representations proved extremely effective for reference resolution, temporal analysis, and the recovery of implicit information, [Palmer et al 85, Dahl et al 86]. This system was internationally recognized for providing path-breaking in-depth coverage of semantics and pragmatics, [Palmer et al 93]. However, porting the system to new domains revealed the limitations of the approach: primarily the fragility of the parser and the major time commitment required to create separate hand-crafted lexical entries for every slight sense variation of a particular lexical item, [Palmer 90b].
I am now investigating verb classifications such as Levin's verb classes, [Levin93], and WordNet, [Miller90, Miller91]. I believe that sets of semantic components can be associated with lexical items, in particular with the primary senses of verbs, that account for most of their syntactic behavior. Such associations can be implemented as sets of features, which provide a more flexible representation than rule-based Lexical Conceptual Structures, allowing for more robust processing and best partial matching, [Palmer and Wu95]. In addition, this approach should be more amenable to empirical methods, since a distributional analysis of syntactic frames should provide critical information about a verb's semantic classification, not just for English but for all languages, [Dang & Palmer99]. These semantic classifications, although potentially quite diverse, should share key cross-linguistic semantic components, as suggested by Talmy [Talmy90] and Jackendoff.
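To make the feature-set idea concrete, here is a minimal Python sketch, with invented feature and sense names, of how primary verb senses might be encoded as sets of semantic-component features and matched by degree of overlap rather than by exact rule application:

```python
# Minimal sketch: verb senses as feature sets (feature and sense names are
# illustrative, not drawn from any actual lexicon).

SENSES = {
    "run.01":   {"+motion", "+manner"},
    "enter.01": {"+motion", "+directed"},
    "break.01": {"+change_of_state", "+cause"},
}

def best_partial_match(observed, senses=SENSES):
    """Return the sense whose feature set overlaps most with the observed features.

    Unlike a rule-based representation, this degrades gracefully: a partial
    overlap still yields a best candidate.
    """
    return max(senses, key=lambda s: len(senses[s] & observed))

print(best_partial_match({"+motion", "+directed"}))  # -> enter.01
```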
This research, supported by NSF grant 9800658, specifically addresses questions of word sense distinctions with respect to verbs, and how regular extensions of meaning can be achieved through the adjunction of particular syntactic phrases. My students and I are developing VerbNet, based on a bilingual Korean/English lexicon, as well as a bilingual Portuguese/English lexicon; these resources make explicit the semantic components, argument structure, and sets of syntactic frames associated with individual lexical items. Many of these semantic components are cross-linguistic, [Palmer et al 96, Palmer et al 98a]. The lexical items in each language form natural groupings based on the presence or absence of semantic components and on their ability to occur or not occur within particular syntactic frames. These bilingual lexicons are being implemented as Feature-based Lexicalized Tree-Adjoining Grammars, [Bleam et al 98, Xia et al 98], but they are intended to be independent of any particular syntactic framework and should map readily onto many widely used formalisms, including CCG, HPSG, LFG, and GB. The English entries are mapped directly onto English WordNet senses.
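The following is a small sketch, with hypothetical field names and values, of the kind of lexical entry described above: semantic components, argument structure, a set of syntactic frames, and a pointer to WordNet senses. It is not the actual VerbNet schema, only an illustration of the information an entry makes explicit:

```python
# Illustrative entry structure only; field names and values are invented and do
# not reflect the actual VerbNet or bilingual lexicon schemas.

from dataclasses import dataclass, field

@dataclass
class LexicalEntry:
    lemma: str
    semantic_components: set           # e.g. {"+motion", "+manner"}
    argument_structure: list           # thematic roles, e.g. ["Agent"]
    syntactic_frames: list             # schematic frames the verb occurs in
    wordnet_senses: list = field(default_factory=list)

run_entry = LexicalEntry(
    lemma="run",
    semantic_components={"+motion", "+manner"},
    argument_structure=["Agent"],
    syntactic_frames=["NP V", "NP V PP.path"],
    wordnet_senses=["run#1"],
)
```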
Levin classes, although a valuable starting point for VerbNet, do not currently provide information that is complete or precise enough to inform lexical entries or to serve as a gold standard for clustering. Both Levin classes and WordNet have limitations that impede their utility as general classification schemes. We have developed a refinement of Levin classes, intersective Levin classes, which are more fine-grained and which exhibit more coherent sets of syntactic frames and associated semantic components, [Palmer et al 97]. Certain syntactic frames indicate the adjunction of prepositional phrases or adverbs that provide a regular extension of meaning to the core sense of many verbs. For example, we associate the directed motion feature with path prepositional phrases for manner of motion verbs (and for other classes, such as sound emission verbs). We have preliminary indications that the membership of our intersective sets is more compatible with WordNet classifications than that of the broader Levin classes, allowing us to attribute the semantic components and associated sets of syntactic frames to specific WordNet senses as well, thus enriching the WordNet representation and providing explicit criteria for word sense disambiguation. We are also finding interesting class correspondences between English and Portuguese, [Dang et al 98].
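A short sketch of the two ideas in this paragraph, with class contents and feature names invented for illustration: an intersective class is simply the set of verbs shared by several Levin classes, and a regular extension of meaning can be modeled as adding a feature when the triggering phrase (here a path PP) is adjoined:

```python
# Class memberships below are illustrative fragments, not the full Levin classes.

LEVIN = {
    "run-51.3.2":   {"run", "jog", "swim"},      # manner of motion
    "meander-47.7": {"run", "meander", "wind"},  # path-shape verbs
}

def intersective_class(*names):
    """Verbs belonging to all of the named Levin classes."""
    return set.intersection(*(LEVIN[n] for n in names))

def extend_sense(features, frame):
    """Add the directed motion component when a path PP is adjoined."""
    return features | {"+directed_motion"} if "PP.path" in frame else set(features)

print(intersective_class("run-51.3.2", "meander-47.7"))      # {'run'}
print(extend_sense({"+motion", "+manner"}, "NP V PP.path"))  # adds +directed_motion
```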
Related research interests include logic programming, artificial intelligence, multi-lingual information extraction and retrieval, and machine translation [Palmer et al 98b, Palmer, Rambow & Nasr98].
VerbNet also forms the basis of the Parameterized Action Representations used for natural language interaction with virtual humans, work carried out at the Human Simulation and Modeling Center.
Visiting Associate Professor, Department of Computer and Information Science