|
Bayes' Theorem: This has many
formulations, but essentially comes down to the observation that one can
learn from experience using probability theory while avoiding the
fallacy of affirming the consequent. All of this rests in the end on the
following theorem of elementary probability theory:
p(T|P) = [p(P|T)*p(T)] : p(P). In this equation, the terms have
standard names:
p(T|P) is the posterior probability (of the theory T given the
data P)
p(P|T) is the likelihood (of the data P given the theory T)
p(T) is the prior probability (of the theory T)
p(P) is the probability of the data
1. Usefulness of Bayes' Theorem:
The strength, interest and usefulness of Bayes' Theorem can be
explained as follows. If T is a theory and P a prediction of the
theory, one can recalculate the probability of T given that P is true
(or false), provided one has three things: the probability of P given T
(which one derives from the theory T itself, as the probability P has
if T is true), the probability of one's theory T, and the probability
of one's prediction P.
A classical example involves finding the orbit of a comet from a few
observations.
Suppose one has a theory T about that orbit that may have a low initial
probability p(T) (since there are many possible orbits), and a
prediction P that the comet will be at a certain place at a certain
time. The probability p(P) of that prediction will also be low apart
from T (since the comet may in fact be at many possible places),
although p(P|T), i.e. the probability that the comet will be at that
place at that time if the theory is true, will as a rule be high (since
otherwise one would not propose the theory).
Now it follows by Bayes' Theorem, i.e. the above elementary
formula of probability theory, that p(T|P), i.e. the probability of the
theory T if the prediction is true, will be much higher than p(T) was
before the prediction was verified, namely by the factor p(P|T):p(P).
Thus, if p(P|T) was 90/100 and p(P) was 1/100, then p(T|P)=90*p(T),
which may make the new probability p(T|P) appreciable even if p(T) was
quite low to start with. If p(T) was also 1/100 (e.g. because its
prediction P is that improbable), then the new p(T|P)=90/100, whereas
the old p(T), before finding that P is true, was 1/100.
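The arithmetic of the comet example can be checked with a minimal
sketch in Python. The function name posterior and the variable names
are illustrative assumptions; the numbers are the ones used above. The
last step, which updates on the prediction turning out false, is an
added illustration that assumes p(~P|T) = 1 - p(P|T) and
p(~P) = 1 - p(P).

```python
def posterior(likelihood, prior, evidence):
    """Bayes' Theorem: p(T|P) = p(P|T) * p(T) / p(P)."""
    if not 0 < evidence <= 1:
        raise ValueError("p(P) must lie in (0, 1]")
    return likelihood * prior / evidence


# Values from the comet example in the text.
p_P_given_T = 90 / 100  # likelihood: probability of the prediction if T is true
p_T = 1 / 100           # prior probability of the theory
p_P = 1 / 100           # probability of the prediction apart from T

# The prediction is verified: the prior is multiplied by p(P|T) / p(P) = 90.
p_T_given_P = posterior(p_P_given_T, p_T, p_P)

# The prediction fails: update on ~P instead (complements are an assumption
# of this sketch, stated in the lead-in).
p_T_given_not_P = posterior(1 - p_P_given_T, p_T, 1 - p_P)

print(f"p(T|P)  = {p_T_given_P:.4f}")      # about 0.9
print(f"p(T|~P) = {p_T_given_not_P:.4f}")  # about 0.001
```

A verified prediction raises the probability of T ninetyfold, while a
failed prediction lowers it to roughly a tenth of its prior value.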
2. Problem of Bayes' Theorem: The main problem involved in
Bayes' Theorem is that it often is not clear how one can establish the
three probabilities one requires to use it, namely
p(P|T), p(P) and p(T).
This is especially so with p(T), in that one often can make plausible
cases for
p(P|T) (it must be high if the explanation is to be useful) and p(P)
(there often can be given evidence that if T is not true, then P is not
probable at all), but since theories cannot be counted like blueberries
or particular instances of kinds of fact, there often seems to be no
plausible way to fix the probability of a theory.
There are several ways to circumvent the problem, but the usual ones
(such as so-called likelihood ratios, p(P|T):p(P|~T), or prior odds,
p(T):p(~T)) all seem to involve a considerable element of subjectivity:
In the end, it all comes down to one's subjective degree of belief in T.
For those who believe that probability is subjective, this is no
objection, and indeed believers in subjective probability feel quite
free to use Bayes' Theorem, while some have even converted to a
subjective interpretation of probability theory precisely because it
permits one to apply Bayes' Theorem.
The problem with this, apart from other objections to subjective
interpretations of probability theory, is that in practice it won't
help much, for example with fanatics.
Take Darwin's theory of evolution. This accounts quite well for many
otherwise problematic facts, and has quite a few successful predictions
to its credit that do not follow from other theories, such as divine
providence. Thus it can be seen as being quite well confirmed by the
evidence and by Bayesian reasoning - unless one is a believer both in
the subjective interpretation of probability and in divine providence,
and therefore fixes the probability of the Darwinian theory as so small
(say, on the order of 10^-1000) that no practical amount of evidence
can much change this (except with verified predictions of the same order
of improbability).
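How much evidence such an extreme prior demands can be made concrete
with a small sketch in Python. It assumes the odds form of Bayes'
Theorem (posterior odds = prior odds times the likelihood ratio
p(P|T):p(P|~T)), treats every verified prediction as independent and as
having the same likelihood ratio, and works in base-10 logarithms
because a number like 10^-1000 is too small for ordinary floating-point
arithmetic. The function name, the ratio of 90 and the target of even
odds are illustrative assumptions, not part of the argument above.

```python
import math


def confirmations_needed(log10_prior_odds, likelihood_ratio,
                         log10_target_odds=0.0):
    """Number of independent verified predictions, each with the same
    likelihood ratio p(P|T)/p(P|~T), needed to raise the odds on T from
    the prior odds to the target odds. All odds are given as log10."""
    if likelihood_ratio <= 1:
        raise ValueError("a confirming prediction needs a ratio above 1")
    gap = log10_target_odds - log10_prior_odds
    if gap <= 0:
        return 0  # the prior odds already meet the target
    # Each confirmation adds log10(likelihood_ratio) to the log odds.
    return math.ceil(gap / math.log10(likelihood_ratio))


# Prior odds of about 10^-1000, as in the example of the fanatic; for so
# small a probability the odds and the probability are practically equal.
n = confirmations_needed(log10_prior_odds=-1000.0, likelihood_ratio=90.0)
print(n)  # 512
```

Under these assumptions it would take 512 verified predictions, each
ninety times more probable on the theory than on its denial, merely to
bring the theory to even odds.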
3. A possible solution: One possible solution is to make a
special assumption about the probability of a theory. The assumption
uses a term that must first be defined: The proper
consequences of a theory T are those statements that follow from T
but do not follow from ~T.
Now one may assume the following
Theoretical Probability Postulate, or TPP:
- TPP: The
probability x of a theory T at any time t is the probability of the
least probable proper consequence that is known to follow from T at
time t.
The justification for this assumption
is that we certainly know that p(T) cannot be higher than the
probability of its least probable proper consequence, for that follows
from probability theory, whereas the stated conventional assumption
answers the problem of how to attribute a probability to a theory, and
indeed does so uniquely, and with empirical justification, namely the
probability of the least probable proper consequence of the theory.
Thus our assumption for the
probability of a theory T at time t is that it is the maximum of what
it may be at t, given the probabilities of the known proper
consequences of T. This is an assumption; it is consistent with
probability theory; it is based on the known facts about what T
entails; and it is a convention.
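The TPP amounts to taking a minimum, which a minimal sketch in Python
makes explicit. The function name and the three labelled consequences
with their probabilities are hypothetical, chosen only to illustrate
the rule; nothing here says how the probabilities of the consequences
themselves are to be established.

```python
def tpp_probability(consequence_probabilities):
    """Theoretical Probability Postulate: the probability of a theory at
    time t is the probability of its least probable known proper
    consequence at time t."""
    probabilities = list(consequence_probabilities.values())
    if not probabilities:
        raise ValueError("at least one known proper consequence is required")
    for p in probabilities:
        if not 0 <= p <= 1:
            raise ValueError("probabilities must lie in [0, 1]")
    # p(T) cannot exceed the probability of any statement T entails,
    # so the minimum is the highest value p(T) can consistently take.
    return min(probabilities)


# Hypothetical proper consequences of T known at time t.
known_at_t = {
    "consequence A": 0.60,
    "consequence B": 0.25,
    "consequence C": 0.40,
}

p_theory = tpp_probability(known_at_t)
print(p_theory)  # 0.25
```

If a further, less probable proper consequence becomes known at a later
time, the same rule assigns the theory that lower probability.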
|