PP-2008-24: Bod, Rens (2008) The Data-Oriented Parsing Approach: Theory and Application. [Report]
Abstract
A corpus-based parsing approach that has been quite successful in
various fields of AI is known as Data-Oriented Parsing, or DOP. DOP
was originally developed as an NLP technique but has been generalized
to music analysis, problem solving and unsupervised structure
learning. The distinctive feature of the DOP approach, when it was
first presented, was to model sentence structures on the basis of
previously observed frequencies of sentence-structure fragments,
without imposing any constraints on the size of these
fragments. Fragments include, for instance, subtrees of depth 1
(corresponding to context-free rules), as well as entire trees.
The DOP approach has been generalized to other modalities, including
music analysis and problem solving. It has turned out that
probabilistic corpus-based parsing outperforms deterministic
rule-based processing not only for language but also for melodic
analysis and problem solving. Our goal for this chapter is therefore
to present the DOP approach from a multi-modal perspective. But in
order to do so, it is convenient to first explain DOP for language
processing, after which we discuss an integrated DOP model that
unifies the different modalities. We will go into the various
computational issues and show how the model can be tested against
hand-annotated corpora. Finally, we will discuss shortcomings of this
supervised approach, and present some results of recent work that
extends DOP towards unsupervised learning.
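
To make the notion of a fragment concrete, the following is a minimal sketch, not taken from the report, of how all fragments of a single parse tree might be enumerated and counted. The tuple-based tree encoding, the example sentence, and the helper names (fragments_rooted_at, all_fragments, depth) are assumptions made for illustration only. A fragment keeps its root expanded; below the root, every nonterminal child is either cut off and left as a frontier node or expanded further.

```python
from collections import Counter
from itertools import product

# Assumed encoding: a tree is a tuple (label, child, ...), where each child
# is a subtree or a word (str). A 1-tuple such as ("NP",) marks a frontier
# node, i.e. a nonterminal at which the fragment was cut off.
tree = ("S", ("NP", "John"), ("VP", ("V", "likes"), ("NP", "Mary")))

def fragments_rooted_at(node):
    """Return every fragment whose root is this node (root always expanded)."""
    label, *children = node
    options = []
    for child in children:
        if isinstance(child, str):
            # A word is always kept as it is.
            options.append([child])
        else:
            # Either cut here (bare label) or continue with any child fragment.
            options.append([(child[0],)] + fragments_rooted_at(child))
    return [(label, *combo) for combo in product(*options)]

def all_fragments(node):
    """Collect the fragments rooted at every nonterminal node of the tree."""
    found = fragments_rooted_at(node)
    for child in node[1:]:
        if not isinstance(child, str):
            found.extend(all_fragments(child))
    return found

def depth(fragment):
    """Length of the longest path from the root to a word or frontier node."""
    children = fragment[1:]
    if not children:
        return 0
    return 1 + max(depth(c) if isinstance(c, tuple) else 0 for c in children)

frags = all_fragments(tree)
counts = Counter(frags)  # over a whole corpus, these counts give the frequencies
rules = [f for f in frags if depth(f) == 1]  # depth-1 fragments: context-free rules

print(len(frags))     # 17
print(len(rules))     # 5
print(tree in frags)  # True: the entire tree is itself a fragment
```

In this sketch no limit is placed on fragment size, so the output ranges from single context-free rules up to the complete tree. Estimating probabilities from the counts is deliberately left out, since the abstract does not specify the estimator.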
Item Type: | Report |
---|---|
Report Nr: | PP-2008-24 |
Series Name: | Prepublication (PP) Series |
Year: | 2008 |
Uncontrolled Keywords: | unifying model DOP |
Depositing User: | Rens Bod |
Date Deposited: | 12 Oct 2016 14:36 |
Last Modified: | 12 Oct 2016 14:36 |
URI: | https://eprints.illc.uva.nl/id/eprint/298 |