Workshop on the Prague Dependency Treebank


This workshop will describe the family of Prague Dependency Treebanks, with some theoretical background on the Functional Generative Description, which comes from the Praguian linguistic tradition.

The main focus, however, will be on the treebanks for Czech, English and Arabic that follow the Prague Dependency Treebank scheme. All the treebanks use a layered approach to treebanking, with four main layers: the word-level layer, the morphological layer, the surface syntactic dependency layer, and the most detailed and advanced one, the syntactic/semantic dependency layer (the "tectogrammatical" layer). The valency lexicon(s) accompanying the treebanks will also be presented, and the links between them and the treebanked texts will be shown, along with the annotation tools used during the annotation process.

Finally, some results obtained by using the treebanks as training material will be described, including but not limited to parsing, co-reference and machine translation between Czech and English.

A search tool for richly annotated dependency treebanks (PDT and 20+ others, including PTB) will be demonstrated. Workshop participants will receive six months of access to search all the accessible treebanks.

Slides will be available at the presenter's web page for this LSA 2011 workshop.

The workshop is free for all students and affiliates.


Jan Hajic, hajic AT ufal DOT mff DOT cuni DOT cz
Zdenka Uresova, uresova AT ufal DOT mff DOT cuni DOT cz