PP-2008-24: The Data-Oriented Parsing Approach: Theory and Application

PP-2008-24: Bod, Rens (2008) The Data-Oriented Parsing Approach: Theory and Application. [Report]


Data-Oriented Parsing (DOP) is a corpus-based parsing approach that has
been quite successful in various fields of AI. DOP was originally
developed as an NLP technique but has been generalized to music
analysis, problem solving and unsupervised structure
learning. The distinctive feature of the DOP approach, when it was
first presented, was to model sentence structures on the basis of
previously observed frequencies of sentence-structure fragments,
without imposing any constraints on the size of these
fragments. Fragments include, for instance, subtrees of depth 1
(corresponding to context-free rules), as well as entire trees.
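
To make the notion of a fragment concrete, the following is a minimal sketch that enumerates the fragments of one small parse tree. It is an illustration under stated assumptions, not the report's implementation: the nested-tuple tree encoding, the example sentence, and the helper names `fragments_rooted_at`, `all_fragments` and `depth` are all hypothetical. A fragment is taken here to be a connected piece of the tree in which each nonterminal node is either expanded with all of its children or cut off as a bare frontier node.

```python
from itertools import product

# Assumed encoding: a tree is (label, child, child, ...);
# a child is either another tree or a word (a plain string).
# A cut-off frontier nonterminal is written as a 1-tuple: (label,).

def fragments_rooted_at(tree):
    """Return all fragments whose root is the root of `tree`."""
    label, *children = tree
    options = []
    for child in children:
        if isinstance(child, str):
            options.append([child])  # words are always kept
        else:
            # Either cut the child off as a frontier node,
            # or continue with any fragment rooted at that child.
            options.append([(child[0],)] + fragments_rooted_at(child))
    return [(label, *combo) for combo in product(*options)]

def all_fragments(tree):
    """Return the fragments rooted at every nonterminal node of `tree`."""
    frags = fragments_rooted_at(tree)
    for child in tree[1:]:
        if not isinstance(child, str):
            frags.extend(all_fragments(child))
    return frags

def depth(node):
    """Depth of a fragment; words and frontier nodes have depth 0."""
    if isinstance(node, str) or len(node) == 1:
        return 0
    return 1 + max(depth(child) for child in node[1:])

# Hypothetical example tree for "John likes Mary".
tree = ("S", ("NP", "John"), ("VP", ("V", "likes"), ("NP", "Mary")))
frags = all_fragments(tree)
rules = [f for f in frags if depth(f) == 1]  # the context-free rules
```

In this sketch the depth-1 fragments are exactly the context-free rules used in the tree, and the complete tree is itself one of the fragments. A DOP model would additionally count how often each fragment occurs across a whole corpus; that step is omitted here.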

The DOP approach has been generalized to other modalities, including
music analysis and problem solving. It has turned out that
probabilistic corpus-based parsing outperforms deterministic
rule-based processing not only for language but also for melodic
analysis and problem solving. Our goal for this chapter is therefore
to present the DOP approach from a multi-modal perspective. To do
so, it is convenient to first explain DOP for language
processing, after which we discuss an integrated DOP model that
unifies the different modalities. We will go into the various
computational issues and show how the model can be tested against
hand-annotated corpora. Finally, we will discuss shortcomings of this
supervised approach, and present some results of recent work that
extends DOP towards unsupervised learning.

Item Type: Report
Report Nr: PP-2008-24
Series Name: Prepublication (PP) Series
Year: 2008
Uncontrolled Keywords: unifying model DOP
Depositing User: Rens Bod
Date Deposited: 12 Oct 2016 14:36
Last Modified: 12 Oct 2016 14:36
URI: https://eprints.illc.uva.nl/id/eprint/298
