LP-1996-13: Data-Oriented Language Processing: An Overview

LP-1996-13: Bod, Rens and Scha, Remko (1996) Data-Oriented Language Processing: An Overview. [Report]

Abstract

Data-oriented models of language processing embody the assumption that human
language perception and production work with representations of concrete past
language experiences, rather than with abstract grammar rules. Such models
therefore maintain large corpora of linguistic representations of previously
occurring utterances. When processing a new input utterance, analyses of this
utterance are constructed by combining fragments from the corpus; the
occurrence-frequencies of the fragments are used to estimate which analysis is
the most probable one. This paper motivates the idea of data-oriented language
processing by considering the problem of syntactic disambiguation. One
relatively simple parsing/disambiguation model that implements this idea is
described in some detail. This model assumes a corpus of utterances annotated
with labelled phrase-structure trees, and parses new input by combining
subtrees from the corpus; it selects the most probable parse of an input
utterance by considering the sum of the probabilities of all its derivations.
The paper discusses some experiments carried out with this model. Finally, it
reviews some other models that instantiate the data-oriented processing
approach. Many of these models also employ labelled phrase-structure trees,
but use different criteria for extracting subtrees from the corpus or employ
different disambiguation strategies; other models use richer formalisms for
their corpus annotations.
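
The scoring idea in the abstract (a parse's probability is the sum of the
probabilities of all its derivations) can be illustrated with a minimal Python
sketch. It is not the report's implementation. It assumes, beyond what the
abstract states, that a fragment's probability is its corpus count divided by
the total count of fragments with the same root label, and that a derivation's
probability is the product of its fragment probabilities. The fragment names,
counts, and function names are all invented for illustration.

```python
from collections import defaultdict
from math import prod

# Hypothetical toy data: (root_label, fragment_id) -> corpus count.
# These fragments and counts are invented, not taken from the report.
FRAGMENT_COUNTS = {
    ("S", "S -> NP VP"): 2,
    ("S", "S -> (NP she) VP"): 1,
    ("S", "S -> NP (VP sleeps)"): 1,
    ("NP", "NP -> she"): 3,
    ("NP", "NP -> he"): 1,
    ("VP", "VP -> sleeps"): 2,
}


def root_totals(counts):
    """Total count of fragments per root label."""
    totals = defaultdict(int)
    for (root, _fragment_id), n in counts.items():
        totals[root] += n
    return totals


def fragment_probability(fragment, counts, totals):
    """Assumed estimate: relative frequency among same-root fragments."""
    root, _fragment_id = fragment
    return counts[fragment] / totals[root]


def derivation_probability(derivation, counts, totals):
    """Assumed: product of the probabilities of the fragments combined."""
    return prod(fragment_probability(f, counts, totals) for f in derivation)


def parse_probability(derivations, counts):
    """Sum over all derivations that yield the same parse tree."""
    totals = root_totals(counts)
    return sum(derivation_probability(d, counts, totals) for d in derivations)


# Three ways to build the same parse of "she sleeps" from the toy fragments.
DERIVATIONS = [
    [("S", "S -> NP VP"), ("NP", "NP -> she"), ("VP", "VP -> sleeps")],
    [("S", "S -> (NP she) VP"), ("VP", "VP -> sleeps")],
    [("S", "S -> NP (VP sleeps)"), ("NP", "NP -> she")],
]

if __name__ == "__main__":
    print(parse_probability(DERIVATIONS, FRAGMENT_COUNTS))
```

In this toy setting the three derivations contribute 0.375, 0.25 and 0.1875, so
the parse scores 0.8125. Summing over derivations matters because the single
most probable derivation need not belong to the most probable parse.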

Item Type: Report
Report Nr: LP-1996-13
Series Name: Logic, Philosophy and Linguistics (LP)
Year: 1996
Date Deposited: 12 Oct 2016 14:39
Last Modified: 12 Oct 2016 14:40
URI: https://eprints.illc.uva.nl/id/eprint/1249
