HDS-43: Corstius, Hugo Brandt (2025) Exercises in Computational Linguistics. Doctoral thesis, Universiteit van Amsterdam.
Text
HDS-43-Hugo-Brandt-Corstius.text.pdf - Published Version Download (9MB) |
Abstract
Not the contents of the following chapters are important, but the method used in them. For want of a better term, one could name this method the "constructive" one. If it is pleasant and useful to philosophize, discuss and talk about linguistic problems, it is surely more useful and pleasant to bring linguistic problems to a solution which is so exact that even an electronic computer can be programmed to do the work. This does not mean that the solution has to be the correct one in all possible cases, this often being an unrealizable ideal. What it does mean is that in all possible cases an assertion about the problem must be made. It is in this respect that the electronic computer has importance for the solving of linguistic problems, apart from its consistency and speed: the solution may be wrong but never vague or ambivalent. In the search for the solution, intuition, language sense or whatever one may call these nonlogical sources of insight, may play a part, but once the solution is reached, these nonlogical aids may not be employed in the application of the solution method proposed. The computer which notably lacks intuition, language sense, and so on, is a guarantee for this condition.
In this paper some computer applications to written language, usually Dutch, are dealt with. The solutions are given in the form of programs in ALGOL 60, a machine-independent programming language, not only for numerical processes, but in general fit to describe complicated processes with absolute rigour. It is to be hoped that this unambiguous language or its successor will become, also outside numerical mathematics, the language of scientific conscience.
In the first chapter the forming of Dutch words is investigated. The concept of the syllable is central here. It has proved possible to isolate the spelling syllable automatically (1.1). This program for hyphenation is used in the automatic typesetting of texts. The program was applied to a collection of newspaper words. This enabled a quantitative analysis of the three parts of the Dutch spelling syllable to be made (1.2). On this basis a generative grammar is found which enables us to produce pronounceable "Dutch-like" words, e.g., new trade names (1.3).
In the second chapter the construction of compound words is exemplified in the construction of the names of cardinal numbers. For five languages a procedure is written to translate from the digital representation of a number into its word representation (2.1). The reverse way is not difficult (2.2) and with these two procedures mutual translation of cardinal number names in different languages is realized (2.3).
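As an illustration of the digits-to-words direction described for section 2.1, the following is a minimal sketch in Python rather than the ALGOL 60 of the thesis. It is not the author's procedure: the function name `dutch_number_name`, the lookup tables, and the restriction to the range 0 to 99 are all assumptions made for this example. It shows the one structural feature that makes Dutch number names a compound-word problem at this scale: the unit is written before the ten, joined by "en".

```python
# Illustrative sketch only: NOT the ALGOL 60 procedure from the thesis.
# All names and the 0..99 range limit are assumptions made for this example.

UNITS = ["nul", "een", "twee", "drie", "vier",
         "vijf", "zes", "zeven", "acht", "negen"]

TEENS = {10: "tien", 11: "elf", 12: "twaalf", 13: "dertien", 14: "veertien",
         15: "vijftien", 16: "zestien", 17: "zeventien", 18: "achttien",
         19: "negentien"}

TENS = {2: "twintig", 3: "dertig", 4: "veertig", 5: "vijftig",
        6: "zestig", 7: "zeventig", 8: "tachtig", 9: "negentig"}


def dutch_number_name(n: int) -> str:
    """Return the Dutch name of a cardinal number in the range 0..99."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0..99")
    if n < 10:
        return UNITS[n]
    if n < 20:
        return TEENS[n]
    tens, unit = divmod(n, 10)
    if unit == 0:
        return TENS[tens]
    unit_word = UNITS[unit]
    # The unit comes first, then "en" ("and"), then the ten.
    # After a unit ending in "e" (twee, drie) the link is spelled "ën".
    link = "ën" if unit_word.endswith("e") else "en"
    return unit_word + link + TENS[tens]


if __name__ == "__main__":
    for value in (7, 14, 21, 32, 90):
        print(value, dutch_number_name(value))
```

Because every number in this range receives a distinct name, the reverse direction can be obtained in this small setting by inverting the mapping, which is consistent with the remark that the reverse way is not difficult.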
In the third chapter the forming of word groups for Dutch sentences is investigated. The noun phrase serves as an example. A grammar is proposed which is used to isolate maximal noun phrases from Dutch newspaper texts (3.1). The application of the program yields two results: a collection of noun phrases with their tree structures (3.2), and a collection of sentences in which the noun phrases are compressed and which thereby have acquired a simpler structure (3.3).
In the fourth chapter, sentences are investigated. Here, the meaning of those sentences is central. The part of meaning has been a growing one in the preceding three chapters: in the first chapter, the meaning of the words to be hyphenated has relevance only in very rare cases. In the second chapter, the meaning of the words, which has to be invariant during translation, is of course relevant, but its structure (the decimal representation of numbers) makes it possible to describe these meanings completely. In the third chapter, the meaning of the sentences in which we want to isolate the noun phrases presents many difficulties. We limited ourselves in the fourth chapter to a situation where the meaning of segments of discourse can be operationally defined. The meaning of a word problem in algebra is "understood" if its right solution is found. We investigate word problems leading to quadratic equations.
Many problems with words, compound words, word groups and sentences are still awaiting a mechanical treatment. It was also our aim to stimulate others in this direction with the examples in this work.
Item Type: | Thesis (Doctoral) |
---|---|
Report Nr: | HDS-43 |
Series Name: | ILLC Historical Dissertation (HDS) Series |
Year: | 2025 |
Additional Information: | Originally published: 1970 (Stichting Mathematisch Centrum) |
Subjects: | Computation Language |
Depositing User: | Dr Marco Vervoort |
Date Deposited: | 26 Sep 2025 01:22 |
Last Modified: | 26 Sep 2025 01:22 |
URI: | https://eprints.illc.uva.nl/id/eprint/2388 |