================================
SemLink VerbNet/PropBank Mapping
================================
Version: 1.1
URL: http://verbs.colorado.edu/~edloper/semlink/
Citation: Edward Loper, Szu-ting Yi, and Martha Palmer. 2007.
  Combining lexical resources: Mapping between PropBank and
  VerbNet. In Proceedings of the 7th International Workshop
  on Computational Semantics, Tilburg, the Netherlands.

Overview
--------
The SemLink mapping between VerbNet and PropBank consists of two
parts: a lexical mapping and a token mapping. The lexical mapping
specifies the potential mappings between PropBank and VerbNet for a
given word, but it does not specify which of those mappings should be
used for any given occurrence of the word. The token mapping provides
the correct mapping between arguments for every predicate in the
PropBank corpus.

In some cases, a predicate from PropBank will not exist in VerbNet,
will not exist in the correct sense, or will have arguments without
corresponding roles in VerbNet. In these cases, the VerbNet role is
listed as 'None' and the argument is left in its unmapped (ARGn) form.

Type Mapping
------------
The type mapping (the lexical mapping described in the overview) is
provided as a single XML file. Each top-level entry describes a single
verb lemma and contains one or more argument-map entries. Each
argument-map entry describes the mapping between arguments for a
specific (PropBank roleset, VerbNet class) pair, using one or more role
entries. Each role entry describes the mapping between PropBank ARGn
labels and VerbNet thematic roles for a single argument role. For
example, when the "muzzle" verb is used in the sense described by
VerbNet class 9.9, the PropBank and VerbNet roles map as follows:
================================
PropBank         VerbNet
--------------------------------
ARG0      <->    Agent
ARG1      <->    Destination
ARG2      <->    Theme
================================
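
As a rough illustration, the lookup in the table above could be built
from the XML file along the following lines. This is only a sketch: the
element and attribute names (pbvn-typemap, predicate, argmap, role,
lemma, pb-roleset, vn-class, pb-arg, vn-theta) and the roleset ID
"muzzle.01" are assumptions made for the example and should be checked
against the actual mapping file. Only the lemma, the VerbNet class, and
the three role pairs are taken from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment. The element/attribute names and the roleset ID
# "muzzle.01" are assumptions; the lemma, class, and roles come from the
# "muzzle" example in the text.
SAMPLE = """
<pbvn-typemap>
  <predicate lemma="muzzle">
    <argmap pb-roleset="muzzle.01" vn-class="9.9">
      <role pb-arg="0" vn-theta="Agent" />
      <role pb-arg="1" vn-theta="Destination" />
      <role pb-arg="2" vn-theta="Theme" />
    </argmap>
  </predicate>
</pbvn-typemap>
"""


def load_type_mapping(xml_text):
    """Return {(lemma, roleset, vn_class): {"ARGn": vn_role}}."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for predicate in root.iter("predicate"):
        lemma = predicate.get("lemma")
        for argmap in predicate.iter("argmap"):
            key = (lemma, argmap.get("pb-roleset"), argmap.get("vn-class"))
            # One role entry per argument: PropBank number -> VerbNet role.
            mapping[key] = {
                "ARG" + role.get("pb-arg"): role.get("vn-theta")
                for role in argmap.iter("role")
            }
    return mapping


type_mapping = load_type_mapping(SAMPLE)
roles = type_mapping[("muzzle", "muzzle.01", "9.9")]
```

Keying the dictionary on the (lemma, roleset, class) triple reflects the
fact that one lemma may have several candidate mappings; choosing among
them for a given occurrence is the job of the token mapping.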

Token Mapping
-------------
The token mapping is available in two forms:

  - vnprop.txt   -- contains just VerbNet role labels
  - vnpbprop.txt -- contains both VerbNet and PropBank role labels

Both files use the same format as PropBank's prop.txt file. In
particular, each line describes a single predicate and its arguments.
The columns are as follows:

  wsj-filename sentence terminal tagger verb inflection arguments...

Where:

- 'wsj-filename' is the name of the file in the merged Penn Treebank,
  WSJ section.
- 'sentence' is the number of the sentence in the file (starting with 0).
- 'terminal' is the number of the terminal in the sentence that is
  the location of the verb. Note that the terminal number counts
  empty constituents as terminals and starts with 0. This holds
  for all references to terminal numbers in this description.
- 'tagger' is the name of the annotator who performed the mapping.
- 'verb' is a token identifying the verb's PropBank roleset and VerbNet
  class. It has the form ROLESET;VN=VNCLASS, where ROLESET is a
  PropBank roleset and VNCLASS is a VerbNet class number.
- 'inflection' consists of 5 characters representing the form, tense,
  aspect, person, and voice of the verb, respectively. See the PropBank
  documentation for details.
- 'arguments...' is a sequence of strings, each representing the
  annotation associated with a particular argument or adjunct of the
  proposition. Each argument string (proplabel) is hyphen ('-')
  delimited and has the following columns:
    1) The 'syntactic relation'. See the PropBank documentation
       for details.
    2) The 'label'. In vnprop.txt, this is a VerbNet thematic role
       label (Agent, Patient, etc.), or a PropBank role (ARG0, ARG1,
       etc.) if the role does not have an appropriate mapping target
       in VerbNet. In vnpbprop.txt, this has the form "ARGn[ROLE]",
       where n is a PropBank role number and ROLE is a VerbNet
       thematic role, or simply "ARGn" if there is no appropriate
       mapping target. The label "rel" is used to mark the position
       of the relation word (i.e., the verb).
    3) The 'feature'. See the PropBank documentation for details.
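
The column layout described above can be read with a few lines of
Python. The following is a minimal sketch, not a reference parser: the
sample line is invented for illustration (it is not taken from the
corpus), the function and field names are hypothetical, and treating a
missing ";VN=" part as "no VerbNet class" is an assumption. The
'inflection' field is kept as an opaque string.

```python
import re
from typing import NamedTuple, Optional


class Argument(NamedTuple):
    relation: str            # syntactic relation column, e.g. "0:1"
    label: str               # raw label column, e.g. "ARG0[Agent]" or "rel"
    pb_role: Optional[str]   # PropBank role ("ARG0", ...) if present
    vn_role: Optional[str]   # VerbNet thematic role if mapped
    feature: Optional[str]   # feature column, if any


# Matches labels such as "ARG0[Agent]" (vnpbprop.txt) or plain "ARG0".
_PB_LABEL = re.compile(r"^(ARG[0-9A-Za-z]+)(?:\[([^\]]+)\])?$")


def parse_argument(token):
    """Split one hyphen-delimited argument string into its columns."""
    parts = token.split("-", 2)
    relation, label = parts[0], parts[1]
    feature = parts[2] if len(parts) > 2 else None
    if label == "rel":
        # Marks the position of the verb itself.
        return Argument(relation, label, None, None, feature)
    match = _PB_LABEL.match(label)
    if match:
        # PropBank role, with a VerbNet role in brackets when mapped.
        return Argument(relation, label, match.group(1), match.group(2), feature)
    # Otherwise a bare VerbNet thematic role, as in vnprop.txt.
    return Argument(relation, label, None, label, feature)


def parse_line(line):
    """Parse one line of vnprop.txt or vnpbprop.txt into a dict."""
    fields = line.split()
    filename, sentence, terminal, tagger, verb, inflection = fields[:6]
    # Assumption: a verb token without ";VN=" has no VerbNet class.
    roleset, _, vn_class = verb.partition(";VN=")
    return {
        "filename": filename,
        "sentence": int(sentence),
        "terminal": int(terminal),
        "tagger": tagger,
        "roleset": roleset,
        "vn_class": vn_class or None,
        "inflection": inflection,
        "arguments": [parse_argument(tok) for tok in fields[6:]],
    }


# Invented example line in the vnpbprop.txt style.
SAMPLE_LINE = (
    "wsj/00/wsj_0003.mrg 5 2 gold muzzle.01;VN=9.9 vp--a "
    "0:1-ARG0[Agent] 2:0-rel 3:1-ARG1[Destination] 6:1-ARGM-TMP"
)
record = parse_line(SAMPLE_LINE)
```

Splitting each argument on at most two hyphens keeps any further hyphens
inside the feature column, so the same function handles both the
vnprop.txt and vnpbprop.txt label styles.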

History
-------
Release 1.0 contained a bug that caused some of the verbs that are not
contained in VerbNet at all to get improper annotations in the
vnpbprop.txt file. This bug has now been fixed.