Ling7800: Advanced Computational Linguistics: Lexical Semantics
Time and Location: Tuesday 9:30-10:45 Stadium ITS classroom 308 (gate 7), Thursday 9:30-10:45 Hellems Linguistics Dept Lounge
Assessment: Presentation of two papers, two homeworks and a term project.
Office Hours: Thursday 11:00 - 12:00 and Friday 5-6
One of the great challenges of Natural Language
Processing is the multitude of choices that language gives us for
expressing the same thing in different ways. This is obviously true when
taking other languages into consideration - the same thought can be
expressed in English, French, Chinese or Russian, with widely varying
results. But it is also true when considering a single language such as
English. Light verb constructions, nominalizations, idioms, slang,
paraphrases, and synonyms all give us myriads of alternatives for "coining
With respect to other languages, one solution that has
often been touted is that of an "interlingua": a universal, language-neutral
semantic representation that all languages could be mapped onto. This
approach has an immediate appeal, since it would obviate the need for
specific translation systems for every possible pair of languages.
Instead, it would only be necessary to build systems for each individual
language that can produce the interlingua representation from an analysis
of the sentences in the language, and that can generate fluent sentences
from interlingua representations. As desirable as this may seem, and in
spite of the tremendous effort that has gone into this quest, the
realization of a suitable "interlingua" has proven to be elusive.
The students in this course will be encouraged to form their own opinion of the
feasibility of an "interlingua." We will
explore in depth alternative styles of semantic representations, and
compare and contrast their contributions to finding a useful, common
semantic representation that can bridge lexical and structural gaps both
mono-lingually and multi-lingually. We will look particularly closely at
Question Answering and Recognizing Textual Entailments as NLP applications
that are in dire need of such bridges. We will also explore alternative
styles of semantic annotations and their cross-linguistic application.
Suggested Schedule and Readings - Open to Modification
Introduction and Module 1: the Lexical Semantics of Verbs
Jan 16, 18, 23, 25, 30, Feb 1, 6, 8, 13, 15, 20, 22, 27
- Jan 16, FrameNet (also Jan 30)
Fillmore, C. J. 1968 "The Case for Case" in E. Bach and R.T. Harms, eds.
Universals in Linguistic Theory, 1-88. New York: Holt,
Rinehart and Winston. Section 3. paper
Fillmore, Charles J. and Atkins, B. T. S. (1998): FrameNet and lexicographic relevance, Proceedings of the First International Conference on Language Resources and Evaluation, Granada, Spain. (The quality of the print may be compromised, as the paper was scanned; as such, it is also a very large file.) paper on this page
Fillmore et al 2001 "Building a large lexical databank which provides deep
Fillmore, C.J. 1977 Scenes-and-Frames Semantics in Fundamental Studies in
Computer Science: Linguistics Structures Processing, Ed. Antonio Zampolli,
pp. 55-81 paper on this page .
- Jan 16, Lexical Conceptual Structure Jackendoff, R.S. 1976 Towards an Explanatory Semantic Representation, Linguistic Inquiry, 7:1, pp. 89-150. paper Second half paper
- Jan 18, Proto-roles Dowty D.R 1991 Thematic Proto-Roles and Argument Selection. Language 67: 547-619 sections 1-7 paper
- Jan 18 & 23 PropBank: Martha Palmer, Dan Gildea, Paul Kingsbury, The Proposition Bank: An Annotated Corpus of Semantic Roles,
draft of paper submitted to Computational Linguistics, December, 2003.
- Jan 18 & 23 Levin Classes and VerbNet:
Levin, B. English Verb Classes: A Preliminary Classification Introduction, MIT Press, pp. 1-23, 1990. paper
Karin Kipper, Anna Korhonen, Neville Ryant, and Martha Palmer. Extending VerbNet with Novel Verb Classes. Fifth International Conference on Language Resources and Evaluation (LREC 2006). Genoa, Italy. June, 2006.
Karin Kipper, Anna Korhonen, Neville Ryant, and Martha Palmer. Extensive Classifications of English verbs. Proceedings of the 12th EURALEX International Congress. Turin, Italy. September, 2006.
- Jan 25 & 30 Hierarchical Sense Distinctions
Martha Palmer, Hoa Dang and Christiane Fellbaum, Making Fine-grained and Coarse-grained sense distinctions, both manually and automatically, Journal of Natural Language Engineering, draft of submitted paper.
- Feb 1 The Generative Lexicon
Pustejovsky, The Generative Lexicon
Pustejovsky J 1991, The Generative Lexicon, Computational Linguistics, Volume 17, Number 4, December.
- Feb 6 More FrameNet
Baker & Ruppenhofer 2002 "FrameNet's Frames and Levin's Verb Classes"
- Feb 8, Interlinguas
Dorr, Bonnie, Eduard Hovy and Lori Levin (2004),
Machine Translation: Interlingual Methods, Encyclopedia of Language and
Linguistics, 2nd edition, Brown, Keith (ed.).
- Feb 13 - no class, Jan 19th talk instead
- Feb 15 Building Verb Meanings Jill Duffield
Rappaport M. and B.Levin 1998 "Building Verb Meanings" in Butt, Geuder,
eds. The Projection of Arguments: Lexical and Compositional
Factors, CSLI Publications paper
- Feb 20 Event Structure and TimeBank Steven Bethard
Pustejovsky, J., Castaño, J., Ingria, R., Saurí, R., Gaizauskas, R.,
Setzer, A. and Katz, G. TimeML: Robust Specification of Event and
Temporal Expressions in Text. In Proceedings of the Fifth International
Workshop on Computational Semantics (IWCS-5), 2003
- Feb 22 Visit with Fernando Pereira
- Feb 27 Event Semantics Kevin Cohen
Davidson D. 1967. "The Logical Form of Action Sentences" Reprinted in Davidson
D: Essays on Actions and Events, Oxford University Press
Parsons T. 1990 Events in the Semantics of English. MIT Press, Boston
- March 1 Event Coreference Miriam Eckert
Event Coreference for Information Extraction, Humphreys et al, 1997
- March 6 Metaphors Vicky Lai
Zachary J. Mason
CorMet: A Computational, Corpus-Based Conventional Metaphor Extraction System,
Computational Linguistics, Volume 30, Number 1, March 2004.
- March 8, 13 Talmy Vicky Lai
Toward a Cognitive Semantics - Volume 2: Typology and Process in Concept Structuring (Language, Speech, and Communication), Chapter 1 Lexicalization Patterns
Module 2: Semantic Representations in NLP Applications - recent
March 15, April 3, 5
- NAACL03 Workshop on Text Meaning - Helen Johnson
HLT-NAACL 2003 ACL Anthology Web Page
W03-0902: Schubert L, Tong M. "Extracting and Evaluating General World Knowledge from the
Brown Corpus" in Proceedings of the HLT-NAACL 2003 Workshop on Text Meaning.
W03-0901: Clark P, Harrison P., and J. Thomson "A Knowledge-Driven Approach to Text Meaning Processing", in Proceedings of the HLT-NAACL 2003 Workshop on Text Meaning. slides
- Parameterized Action Representations
Karin Kipper and Martha Palmer (2000),
Representation of Actions as an Interlingua,
Proceedings of the Third Workshop on Applied Interlinguas,
held in conjunction with ANLP-NAACL 2000.
Module 3. Machine Learning
March 20, 22, Dima
Module 4: Word Sense Disambiguation
April 10 (Dima)
- Schütze "Automatic Word Sense Discrimination" 1998
- McCarthy "Finding Predominant Word Senses in Untagged Text". ACL2004
- Eneko Agirre, David Martínez, Oier López de Lacalle and Aitor Soroa, Two graph-based algorithms for state-of-the-art WSD, In the Proceedings of EMNLP06, held at ACL06, Sydney, Australia, pp. 585-593 paper
Module 5: Ontologies in NLP
Apr 12 (Also Projects), 17, 19
- Formal Ontology and Information Systems, Nicola Guarino,
In: Formal Ontology in Information Systems. Proceedings of FOIS'98, Trento, Italy, June 6-8, 1998. IOS Press, Amsterdam, 1998. Trevor Pincock
- Medical WordNet Christiane Fellbaum, Udo Hahn, Barry Smith,
"Towards new information resources for public health - From WordNet to Medical WordNet", Journal of Biomedical Informatics 39 (2006) 321-332.
- Roberto Navigli, Meaningful Clustering of Senses Helps Boost Word Sense Disambiguation Performance, In the Proceedings of ACL2006, Sydney, Australia paper
Navigli, Roberto and Paola Velardi and Alessandro Cucchiarelli and Francesca Neri. "Extending and Enriching WordNet with OntoLearn" In: Proceedings of the Second Global WordNet Conference, pp. 279-284, Brno, Czech Republic, January 20-23, 2004. paper
For more papers, see here
- CoreLex Paul Buitelaar CoreLex: An Ontology of Systematic Polysemous Classes In: Formal Ontology in Information Systems. Proceedings of FOIS'98, Trento, Italy, June 6-8, 1998. IOS Press, Amsterdam, 1998. Find paper here Trevor Pincock
Paul Buitelaar, Philipp Cimiano, Bernardo Magnini Ontology Learning from Text: An Overview In: Paul Buitelaar, Philipp Cimiano, Bernardo Magnini (eds.) Ontology Learning from Text: Methods, Evaluation and Applications Frontiers in Artificial Intelligence and Applications Series, Vol. 123, IOS Press, July 2005.
Paul Buitelaar, Philipp Cimiano, Stefania Racioppa, Melanie Siegel Ontology-based Information Extraction with SOBA. In: Proc. of LREC, Genoa, Italy, May 2006.
find papers here
Module 6: Empirical Studies of Lexical Semantic Phenomena
April 24, 26 - NAACL Conference, No classes
May 1, 3
- Verb Particle Constructions Jena Hwang
Villavicencio, A. (2003). Verb-particle constructions and lexical resources. In Francis Bond, Anna Korhonen, Diana McCarthy, and Aline Villavicencio, editors, Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, pages 57-64, Sapporo, Japan.
Villavicencio, A. (2003). Verb-Particle Constructions in the World Wide Web. Proceedings of the ACL-SIGSEM Workshop on the Linguistic Dimensions of Prepositions and their use in Computational Linguistics Formalisms and Applications. Toulouse, France, 2003.
Villavicencio, A., A. Copestake, B. Waldron, F. Lambeau (2004). The Lexical Encoding of MWEs. In T.Tanaka, A. Villavicencio, F. Bond, A. Korhonen eds. Proceedings of the ACL 2004 Workshop on Multiword Expressions: Integrating Processing. Barcelona, 2004.
- Taxonomy Induction Jena Hwang
Rion Snow, Daniel Jurafsky and Andrew Y. Ng,
Semantic Taxonomy Induction from Heterogenous Evidence, pp. 801-808,
Module 7: Additional Reading, Inducing Verb Classes
- Paola Merlo; Suzanne Stevenson; Vivian Tsang; Gianluca Allaria
A Multilingual Paradigm for Automatic Verb Classification, In the Proceedings of ACL02, Philadelphia, PA, July, 2002 paper
- Sabine Schulte im Walde,
The Induction of verb frames and verb classes from corpora, to appear as Chapter 61 in Corpus Linguistics: An International Handbook.
- Sabine Schulte im Walde
Experiments on the Automatic Induction of German Semantic Verb Classes
Computational Linguistics 32(2):159-194, 2006, paper
- Merlo P., E. Esteve Ferrer (2006) "The Notion of Argument in PP Attachment", to appear in Computational Linguistics 32(2). paper
Module 8: Project Presentations
Final Exam time, Wed, May 9 10:30 - 1:30