MoL-2023-21: Wildenburg, Franciscus Cornelis Lambertus (2023) Investigations into Semantic Underspecification in Language Models. [Report]
Text: MoL-2023-21.text.pdf - Published Version (870kB)
Abstract
Several (position) papers have drawn attention to the challenges semantic underspecification may bring to modern language models, yet relatively little research has been done on this topic. We contribute to this area of research by presenting DUST, a dataset of underspecified sentences annotated with their domain of underspecification. Using this dataset in three experiments based on prompts, language model perplexity, and diagnostic classifiers, we study how modern language models process sentences containing semantic underspecification. We find that the ability of language models to recognize underspecification does not correlate with some commonly used metrics for language models, and that a fine-grained approach to underspecification could greatly benefit the research community.
Item Type: | Report |
---|---|
Report Nr: | MoL-2023-21 |
Series Name: | Master of Logic Thesis (MoL) Series |
Year: | 2023 |
Subjects: | Language Logic |
Depositing User: | Dr Marco Vervoort |
Date Deposited: | 26 Sep 2023 13:09 |
Last Modified: | 26 Sep 2023 13:09 |
URI: | https://eprints.illc.uva.nl/id/eprint/2271 |