Here is a short abstract of my thesis. A somewhat longer synopsis is also available.
The Minimum Description Length Principle
and Reasoning under Uncertainty
Peter Grünwald
ILLC Dissertation Series no. DS-1998-03
Most research reported in the thesis concerns the so-called
Minimum Description Length (MDL) Principle. The MDL
Principle is a general method for inductive
inference. The fundamental idea behind the MDL
Principle is that any regularity in a given set of data can be used to
compress the data, i.e. to describe it using fewer symbols
than needed to describe the data literally. The more regularities
there are in the data, the more we can compress it. This leads to the
view (which is just a version of Occam's famous razor) that the more
we can compress a given set of data, the more we can say we have
learned about the data.
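To make the link between regularity and compression concrete, here is a small sketch (my own illustration in Python, not code from the thesis) that feeds a highly regular byte sequence and a pseudo-random one to a general-purpose compressor and compares the resulting description lengths.

# Toy illustration (not from the thesis): regular data can be compressed
# far below its literal description length, irregular data cannot.
import random
import zlib

n = 10_000
regular = b"01" * (n // 2)                                   # highly regular sequence
random.seed(0)
irregular = bytes(random.getrandbits(8) for _ in range(n))   # no obvious regularity

for name, data in [("regular", regular), ("irregular", irregular)]:
    compressed = zlib.compress(data, 9)
    print(f"{name:9}: {len(data)} bytes literal -> {len(compressed)} bytes compressed")

The regular sequence shrinks to a small fraction of its literal length, while the pseudo-random bytes hardly compress at all; in MDL terms, the first sequence contains much more regularity that can be learned.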
The thesis starts with an
introduction to MDL intended for a general audience. It continues with some new
theoretical research concerning MDL and related
statistical methods, such as Bayesian Statistics and the Maximum Entropy
Principle. This is followed by a practical part, in which results of
applying MDL to real-world data sets are reported. It ends with a part
on non-monotonic reasoning and Artificial Intelligence's frame problem, which is not directly related to MDL.
Question: Can we use simplistic models?
The main question investigated in the thesis is the following:
Under what circumstances is it safe or even advisable to use overly simplistic models for the data at hand?
The result of a statistical analysis of a given set of data is nearly
always a model that is a gross simplification of the process that
actually underlies the data. Nevertheless, such overly simple models
are often successfully applied in practice. How is this possible? And
can we identify the situations in which it is possible and those in
which it is not?
These are the main questions we investigate. Though our research is
carried out in the framework of the MDL Principle, the results are
also relevant to other statistical methods.
Answer: 'Safe' and 'Risky' Statistics
Briefly, we reach the following conclusion: overly simple models can
be used to make predictions and decisions in two different ways, a
'safe' one and a 'risky' one. If a model is used in the 'safe' way,
then it will be 'reliable' in the following sense: the model gives a
correct impression of the prediction error that will be made if it is
used to predict future data, even when the model is a gross
simplification of the process that truly underlies the given data. If
the model is used in the 'risky' way, there is no such guarantee
(nevertheless, such usage of a model often makes sense). We state and
prove several theorems showing that, under many circumstances,
incorrect models can be 'reliable' in this sense. The concept of
'reliability' has several applications; we mention three:
- It can be used to extend the scope of the MDL Principle to arbitrary
model classes and error measures.
- It allows one to distinguish between 'safe' and 'risky' versions of
the method of Maximum Entropy (which is closely related to MDL),
thereby explaining both successes and failures of this controversial
principle.
- It provides new insights into the ongoing debate between
'frequentist' and 'subjectivist' (Bayesian) statisticians.
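To give a rough feel for what 'reliability' means here, the sketch below is a toy simulation of my own (not one of the theorems in the thesis; the data-generating process and function names are made up for illustration). An i.i.d. Bernoulli model, a gross simplification of data with strong dependence between consecutive symbols, is fitted to a training sequence, and the per-symbol log-loss it reports on that sequence is compared with the log-loss it incurs on fresh data from the same source.

# Toy illustration (not a theorem from the thesis): an overly simple model
# whose own assessment of its prediction error is nevertheless accurate.
import math
import random

def dependent_bits(n, p_stay=0.9, p_one=0.7, rng=random):
    """Binary data with strong dependence but marginal frequency of 1s close to p_one."""
    bits = [1 if rng.random() < p_one else 0]
    for _ in range(n - 1):
        if rng.random() < p_stay:
            bits.append(bits[-1])                 # usually repeat the previous symbol
        else:
            bits.append(1 if rng.random() < p_one else 0)
    return bits

def bernoulli_logloss(bits, p):
    """Average log-loss in bits per symbol under an i.i.d. Bernoulli(p) model."""
    return -sum(math.log2(p if b == 1 else 1 - p) for b in bits) / len(bits)

rng = random.Random(42)
train = dependent_bits(50_000, rng=rng)
test = dependent_bits(50_000, rng=rng)

p_hat = sum(train) / len(train)                   # maximum-likelihood fit of the simple model
print("log-loss on training data:", round(bernoulli_logloss(train, p_hat), 3))
print("log-loss on fresh data:   ", round(bernoulli_logloss(test, p_hat), 3))

The two numbers nearly coincide: the Bernoulli model ignores the strong dependence in the data, yet it gives a correct impression of how well it will predict future data from the same source. This is the flavour of 'safe' use; using the same model to assign probabilities to whole blocks of symbols, for instance, would be a 'risky' use for which no such guarantee holds.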