Learning from experience:
This is an explanation of how one can learn from theories and their logical consequences when these consequences are verified or falsified. The explanation is closely related to Bayesian Reasoning, but cast in such a way that certain problems with Bayesian Reasoning are avoided and all relevant assumptions are clearly stated.
The starting point is standard propositional logic (PL) based on & and ~, which are sufficient to define all standard connectives. This is extended to temporal propositional logic and temporal probability theory as follows, where "p(α,Q)" is read as "person α's probability for Q" ("α" = "alpha").
A1. (P&Q).t IFF (P).t & (Q).t
A2. (p(α,P)=p(α,Q)).t IFF p(α,P).t = p(α,Q).t
That is: Propositions are temporally relativized by an operator ".t" attached to them, read as "at t", and this operator distributes as one expects over & and =, which together with ~ are the basic logical operators. The times t are supposed to be discrete and ordered by <=. This is needed to keep track of the truth-values and probabilities of propositions at various times.
A3. (EP)(Et)(Ex) ( 0 <= p(α,P).t=x <= 1)
A4. (EP)(EQ)(Et) ( (Ex)(p(α,P).t=x) & (Ey)(p(α,Q).t=y) &
~(Ez)(p(α,P&Q).t=z) )
A3 says that for every person α there are propositions that have some personal probability for α at some t, which leaves open that there are propositions that have no personal probability for α at t. A4 says that for every person α there are, at some t, pairs of propositions such that both propositions have some probability for α but their conjunction does not.
So A3 and A4 allow that there are propositions to which α assigns no personal probability, for whatever reason, and conjunctions to which α assigns no probability even though α has probabilities for the conjuncts. This differs from non-personal probability, where every logical possibility always has some probability, even if it is unknown.
A5. p(α,Q|T).t=x IFF p(α,Q&T).t : p(α,T).t=x
A6. p(α,~Q).t=1-y IFF p(α,Q).t=y
A7. p(α,~Q|T).t=1-x IFF p(α,Q|T).t=x
A5 defines conditional probability, with ":" for division, and A6 and A7 give the probability of ~Q in the unconditional and the conditional case. Note that these are personal probabilities of α: it is up to α to decide what they are, and often, though not necessarily, they are what α believes the real probabilities to be. The reading of "p(α,Q|T).t" is "the probability for α of Q given T at t".
From the two conditional probabilities p(α,Q|T).t and p(α,Q|~T).t plus p(α,T).t we can calculate all entries of a fundamental table that lists the probabilities of all logical alternatives. Here is the fundamental table in three convenient forms, with the .t left out as understood:
|    | I      |         | II        |            | III              |                    |          |
| t  | T      | ~T      | T         | ~T         | T                | ~T                 |          |
| Q  | a      | b       | p(α,Q&T)  | p(α,Q&~T)  | p(α,Q|T)*p(α,T)  | p(α,Q|~T)*p(α,~T)  | p(α,QT)  |
| ~Q | c      | d       | p(α,~Q&T) | p(α,~Q&~T) | p(α,~Q|T)*p(α,T) | p(α,~Q|~T)*p(α,~T) | p(α,~QT) |
|    | p(α,T) | p(α,~T) | p(α,T)    | p(α,~T)    | p(α,T)           | p(α,~T)            | 1        |
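The table can be filled in mechanically from the three numbers it starts from. The following is only a minimal sketch in Python: the function name fundamental_table is a hypothetical helper, the labels h, i and j for p(α,Q|T), p(α,Q|~T) and p(α,T) are the ones the text adopts further on, and the sample values are arbitrary illustrations.

```python
from fractions import Fraction


def fundamental_table(h, i, j):
    """Return the cells a, b, c, d of the fundamental table.

    h = p(Q|T), i = p(Q|~T), j = p(T), all for one person at one time.
    Hypothetical helper for illustration only.
    """
    a = h * j              # p(Q&T)   = p(Q|T)  * p(T)
    b = i * (1 - j)        # p(Q&~T)  = p(Q|~T) * p(~T)
    c = (1 - h) * j        # p(~Q&T)  = p(~Q|T) * p(T), with p(~Q|T) = 1 - h by A7
    d = (1 - i) * (1 - j)  # p(~Q&~T) = p(~Q|~T) * p(~T)
    return a, b, c, d


# Arbitrary sample values, chosen only to have something to compute with.
h, i, j = Fraction(9, 10), Fraction(1, 5), Fraction(1, 2)
a, b, c, d = fundamental_table(h, i, j)

print("row sums:   ", a + b, c + d)   # p(QT) and p(~QT)
print("column sums:", a + c, b + d)   # p(T) and p(~T)
print("total:      ", a + b + c + d)
```

Exact fractions are used instead of floats so that the row, column and total sums can be compared with = rather than approximately.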
From A1-A7 we get something much like standard probability theory for those propositions that α does have probabilities for, since we can easily prove:
T1. p(α,T).t = p(α,Q&T).t + p(α,~Q&T).t
For p(α,Q&T).t + p(α,~Q&T).t
= p(α,Q|T).t*p(α,T).t + p(α,~Q|T).t*p(α,T).t    by A5
= (p(α,Q|T).t + p(α,~Q|T).t)*p(α,T).t    by algebra
= p(α,T).t    by A7
Likewise
T2. p(α,QT).t = p(α,Q|T).t*p(α,T).t + p(α,Q|~T).t*p(α,~T).t
since p(α,Q|T).t*p(α,T).t + p(α,Q|~T).t*p(α,~T).t = p(α,Q&T).t + p(α,Q&~T).t by A5. The sum is written as p(α,QT).t to indicate that p(α,Q).t is calculated with respect to p(α,T).t and two of α's conditional probabilities involving Q and T. This will be of some importance below.
Since also by A6
T3. p(α,QT).t + p(α,~QT).t = p(α,T).t + p(α,~T).t = 1
the fundamental table has been justified at this point. (The a, b, c, d entries in it are convenient abbreviations for the four possible logical alternatives.)
A8. (T |-α Q).t IFF p(α,Q|T).t=1 V p(α,T).t=0
A9. (|-α Q).t IFF p(α,Q).t=1
This defines logical implication and verified formula for α in terms of the personal probability of α. Note that only 1 and 0 are used here, and that we thus have the basis for standard bivalent propositional logic.
T4. (T |-α Q).t --> p(α,T).t <= p(α,Q).t
For suppose (T |-α Q).t, i.e. by A8 p(α,Q|T).t=1 V p(α,T).t=0.
In case p(α,T).t=0, we have p(α,T).t <= p(α,Q).t.
So suppose p(α,T).t>0. Then p(α,Q|T).t=1 and so p(α,~Q&T).t=0 by A7 and A5, whence p(α,T).t = p(α,Q&T).t by T1. Since p(α,Q&T).t <= p(α,Q&T).t + p(α,Q&~T).t = p(α,Q).t, again p(α,T).t <= p(α,Q).t. Thus T4.
Therefore also, defining (T -||-α Q).t =def (T |-α Q).t & (Q |-α T).t
T5. (T -||-α Q).t --> p(α,T).t = p(α,Q).t
which is to say that logical equivalents have equal probabilities. Again, this is like standard probability theory, but relativized to α's judgements.
Now we are going to say how one may learn from experience.
For given p(α,Q|T).t=h, p(α,Q|~T).t=i and p(α,T).t=j, with h≠i:
A10. p(α,Q|T).t+1=h IFF p(α,Q|T).t=h
A11. p(α,Q|~T).t+1=i IFF p(α,Q|~T).t=i
A12. p(α,T).t+1 = p(α,T).t IFF 0 < p(α,Q).t < 1
A13. p(α,T).t+1 = p(α,T|Q).t IFF p(α,Q).t=1
A14. p(α,T).t+1 = p(α,T|~Q).t IFF p(α,~Q).t=1
Any given set p(α,Q|T).t, p(α,Q|~T).t, p(α,T).t where T is a theory is called a basic theory for α if p(α,Q|T).t≠p(α,Q|~T).t. A10 and A11 insist that the conditional probabilities in a basic theory remain constant in time. A12 to A14 state how p(α,T).t in a basic theory changes or not, depending on what α verifies about the Q in the set: it remains constant if α verifies neither Q nor ~Q, and changes with Q or ~Q if either of these is verified for α.
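The step from t to t+1 described by A10-A14 can be sketched as a small function. This is a sketch under stated assumptions, not a definitive implementation: the name update_theory and the argument q (standing for p(α,Q).t as α judges it) are hypothetical, the sample values are arbitrary, and it is assumed that 0 < p(α,QT).t < 1 so that the divisions are defined.

```python
from fractions import Fraction


def update_theory(h, i, j, q):
    """One step from t to t+1 for a basic theory (h, i, j).

    h = p(Q|T), i = p(Q|~T), j = p(T) at t; q = p(Q) at t as the person
    judges it: 1 if Q is verified, 0 if ~Q is verified, otherwise
    strictly between 0 and 1. Hypothetical helper for illustration.
    Assumes 0 < h*j + i*(1-j) < 1, so neither division is by zero.
    """
    p_qt = h * j + i * (1 - j)                # p(QT), by T2
    if q == 1:                                # A13: p(T).t+1 = p(T|Q).t
        j_next = h * j / p_qt
    elif q == 0:                              # A14: p(T).t+1 = p(T|~Q).t
        j_next = (1 - h) * j / (1 - p_qt)
    else:                                     # A12: nothing verified
        j_next = j
    return h, i, j_next                       # A10, A11: h and i stay fixed


# Arbitrary sample values for a basic theory (h differs from i).
h, i, j = Fraction(9, 10), Fraction(1, 5), Fraction(1, 2)
after_q = update_theory(h, i, j, 1)                   # Q verified
after_not_q = update_theory(h, i, j, 0)               # ~Q verified
unchanged = update_theory(h, i, j, Fraction(1, 2))    # neither verified

print(after_q[2], after_not_q[2], unchanged[2])
```

With these sample values h exceeds i, so verifying Q raises the probability of T and verifying ~Q lowers it, while the conditional probabilities are passed through untouched.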
To show how this works, put p(α,Q|T).t=h, p(α,Q|~T).t=i and p(α,T).t=j, and suppose p(α,Q).t=1. Writing ~j for 1-j, we have
(*) p(α,T).t+1 = p(α,T|Q).t    by A13
= (p(α,Q|T).t : p(α,QT).t) * p(α,T).t    by A5 and T5, for p(α,Q&T).t=p(α,T&Q).t
= (h : (hj+i~j)) * j    by adopted conventions
Thus p(α,T).t+1 = (p(α,Q|T).t : p(α,QT).t) * p(α,T).t, and so the new probability p(α,T).t+1 of the theory differs by a multiplicative factor p(α,Q|T).t : p(α,QT).t from the old p(α,T).t. Now clearly
p(α,Q|T).t : p(α,QT).t >= 1
IFF h : (hj+i~j) >= 1
IFF h >= hj+i~j
IFF h~j >= i~j    using ~j=1-j
IFF h >= i    supposing ~j>0
IFF p(α,Q|T).t >= p(α,Q|~T).t
Therefore the direction of the degree of confirmation depends only on the conditional probabilities: the multiplicative factor p(α,Q|T).t : p(α,QT).t equals or exceeds 1 iff p(α,Q|T).t equals or exceeds p(α,Q|~T).t. And as the conditional probabilities remain constant by A10 and A11, this direction remains constant over time.
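The equivalence between the size of the factor and the order of h and i can also be checked numerically. The following sketch is an illustration, not a proof: it runs over an arbitrary grid of exact fractions, the function name factor_counterexamples is hypothetical, and cases with j=0, j=1 or p(α,QT)=0 are skipped because the derivation supposes ~j>0 and a defined quotient.

```python
from fractions import Fraction
from itertools import product


def factor_counterexamples(steps=10):
    """Collect grid points where (factor >= 1) and (h >= i) disagree.

    h = p(Q|T), i = p(Q|~T), j = p(T); factor = h : (hj + i~j).
    Hypothetical helper; returns the list of disagreeing (h, i, j) triples
    and the number of triples that were checked.
    """
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    bad, checked = [], 0
    for h, i, j in product(grid, repeat=3):
        if j == 0 or j == 1:
            continue                      # the derivation supposes 0 < j < 1
        p_qt = h * j + i * (1 - j)        # p(QT), by T2
        if p_qt == 0:
            continue                      # factor undefined
        checked += 1
        factor = h / p_qt
        if (factor >= 1) != (h >= i):
            bad.append((h, i, j))
    return bad, checked


bad, checked = factor_counterexamples()
print("triples checked:", checked, "counterexamples:", len(bad))
```

An empty list of counterexamples on the grid agrees with the chain of equivalences above, though only the algebra establishes it in general.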
Next, there is the problem for Bayesian Reasoning that if p(α,Q).t=1, i.e. (|-α Q).t, then also, by probability theory, p(α,T|Q).t=p(α,T).t. This problem is avoided by the following theorem, formulated for the same basic theory as before, and using the definitions (T α-rel Q) =def p(α,Q&T)≠p(α,Q)*p(α,T) and (T α-irr Q) =def ~(T α-rel Q).
T6. (T α-rel Q) --> p(α,QT) < 1
Proof: p(α,QT) = p(α,Q|T)*p(α,T) + p(α,Q|~T)*p(α,~T)    by T2
= hj+i~j    by adopted conventions
Now hj+i~j = 1
IFF hj+i-ij = 1    using ~j=1-j
IFF hj-ij = (1-i)
IFF (h-i)*j = (1-i)
IFF ((h-i):(1-i))*j = 1    supposing i<1
Lemma: ((h-i):(1-i))*j = 1 IFF h=1 & j=1 & h>i, assuming h, i and j are probabilities.
Proof: Make the assumption about h, i and j. Suppose the RHS of the equivalence. Then ((h-i):(1-i))*j turns into ((1-i):(1-i)), and since h=1 and h>i we have i<1, so ((1-i):(1-i))=1. Next suppose the LHS. Assume h<=i. Then h-i<=0 and so ((h-i):(1-i))*j ≠ 1. So h>i follows from the LHS. Assume h<1. Then ((h-i):(1-i)) < 1 and so ((h-i):(1-i))*j ≠ 1. So h=1 follows from the LHS. Since we have proved h=1, the LHS reduces to ((1-i):(1-i))*j = j = 1, and so j=1. Thus the lemma has been proved.
Now if j=1 then ~j=0 and then (T α-irr Q) by T4. So if (T α-rel Q) then j<1 and so hj+i~j < 1 by the lemma. Therefore indeed (T α-rel Q) --> p(α,QT) < 1. Qed.
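T6 can be illustrated by a numerical check over a grid of exact fractions. This is a sketch and not a substitute for the proof: the names relevant and t6_violations are hypothetical, the grid is arbitrary, and p(α,Q) in the definition of α-rel is taken to be p(α,QT) as calculated by T2.

```python
from fractions import Fraction
from itertools import product


def relevant(h, i, j):
    """(T rel Q): p(Q&T) differs from p(Q)*p(T), with p(Q) taken as p(QT).

    h = p(Q|T), i = p(Q|~T), j = p(T). Hypothetical helper.
    """
    p_qt = h * j + i * (1 - j)    # p(QT), by T2
    return h * j != p_qt * j      # p(Q&T) = hj by A5


def t6_violations(steps=10):
    """Grid points where T is relevant to Q and yet p(QT) is not below 1."""
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    return [
        (h, i, j)
        for h, i, j in product(grid, repeat=3)
        if relevant(h, i, j) and h * j + i * (1 - j) >= 1
    ]


violations = t6_violations()
print("violations of T6 on the grid:", len(violations))
```

On this reading relevance amounts to 0 < j < 1 together with h≠i, in which case hj+i~j is a weighted average of two different probabilities with positive weights and so stays below 1.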
Hence it is quite possible that p(α,QK).t=1 & p(α,QT).t<1. The probabilities involved in a basic theory need not be the same as those of propositions that are not involved in it, but the latter may be used to update the probabilities in a basic theory by (*). And indeed one may also write p(α,QTK).t to make K explicit as well.