Category Archives: Logic

Gödel's Incompleteness

I’ve mentioned Gödel’s incompleteness theorems many times on this blog, but I’ve never actually written about them in detail. I was asking, on twitter, for topics that readers would be interested in, and one thing that came up is actually go through the proof of incompleteness. I’m happy to take the time to do that incompleteness is one of the most beautiful and profound proofs that I’ve ever seen.

It takes a fair bit of effort though, so it’s going to take a couple of posts just to get through the preliminaries. What I’m going to do is work with this translation of the original paper where Gödel published his first incompleteness proof. Before we can get to the actual proof, we need to learn a bit about the particular kind of logic that he used in his proof.

The story of incompleteness is really pretty interesting. In the late 19th and early 20th centuries, mathematicians started to get really interested in formal foundations. Why, exactly, this happened is somewhat of a debate, but the story I find most likely is that it’s a result of Cantor and his naive set theory.

Set theory came along, and it seemed like such a great idea. It seems incredibly simple, and it makes many proofs in many different subject areas appear to be really simple. But there’s a huge problem: it’s not consistent.

The problem is what’s come to be known as Russell’s paradox. I’ve explained Russell’s paradox a bunch of times on the blog. For this post, I’m going to do it a bit differently. As part of my prep for this series of posts, I’ve been reading the book Gödel’s Proof, and in that book, they have a way of rephrasing Russell’s paradox which I think clarifies it nicely, so I’m going to try their version.

If you think about the set of all sets, you can partition it into two non-overlapping subsets. There’s a subset of normal sets, and a class of abnormal sets. What makes a set abnormal is that it’s self-referential – an abnormal set contains itself as a member.

So considered the set of all normal sets. Is it, itself, a normal set?

The answer is: argh! Because if the set of all normal sets is normal, then it must be a member of itself. But if it’s a member of itself, it can’t be normal. Similarly, if you start off saying that the set of all normal sets isn’t normal, then by definition, it is normal. No matter what you do, you’re screwed.

What Russell’s paradox did was show that there was a foundational problem in math. You could develop what appeared to be a valid mathematicial structure and theory, only to later discover that all the work you did was garbage, because there was some non-obvious fundamental inconsistency in how you defined it. But the way that foundations were treated simple wasn’t strong or formal enough to be able to detect, right up front, whether you’d hard-wired an inconsistency into your theory. So foundations had to change, to prevent another Cantor incident.

So, eventually, along came two mathematicians, Russell and Whitehead, and they created an amazing piece of work called the Principia Mathematica. The principia was supposed to be an ideal, perfect mathematical foundation. The Principia was supposed to have two key properties: it was supposed to consistent, and it was supposed to be complete.

Consistent meant that the statement would not allow any inconsistencies of any kind. If you used the logic and the foundantions of the Principia, you couldn’t even say anything like Russell’s paradox: you couldn’t even write it as a valid statement.
Complete meant that every true statement was provably true, every false statement was provably false, and every statement was either true or false.

A big part of how it did this was by creating a very strict stratification of logic. You could reason about a specific number, using level-0 statements. You could reason about sets of numbers using level-1 statements. And you could reason about sets of sets of numbers using level-2 statements. In a level-1 statement, you could make meta-statements about level-0 properties, but you couldn’t reason about level-1. In level-2, you could reason about level-1 and level-0, but not about level-2. This meant that you couldn’t make a set like the set of normal sets – because that set is formed using a predicate – and when you’re forming a set like “the set of proper sets”, the sets you can reason about are a level below you. You can form a level-1 set using a level-1 statement about level-0 objects. But you can’t make a level-1 statement about level-1 objects! So self-referential systems would be completely impossible in the Principia’s logic.

As a modern student of math, it’s hard to understand what a profound thing they were trying to do. We’ve grown up learning math long after incompleteness became a well-known fact of life. (I read “Gödel Escher Bach” when I was a college freshman – well before I took any particularly deep math classes – so I knew about incompleteness before I knew enough to really understand what completeness woud have meant!) The principia would have been the perfection of math, a final ultimate perfect system. There would have been nothing that we couldn’t prove, nothing in math that we couldn’t know!

What Gödel did was show that using the Principia’s own system, and it’s own syntax, that not only was the principia itself flawed, but that any possible effort like the principia would inevitably be flawed!

With the incompleteness proof, Gödel showed that even in the Principia, even with all of the effort that it made to strictly separate the levels of reasoning, that he could form self-referential statements, and that those self-referential statements were both true and unprovable.

The way that he did it was simply brilliant. The proof was a sequence of steps.

He showed that using Peano arithmetic – that is, the basic definition of natural numbers and natural number arithmetic – that you could take any principia-logic statement, and uniquely encode it as a number – so that every logical statement was a number, and ever number was a specific logical statement.
Then using that basic mechanic, he showed how you could take any property defined by a predicate in the principia’s logic, and encode it as a arithmetic property of the numbers. So a number encoded a statement, and the property of a number could be encoded arithmetically. A number, then, could be both a statement, and a definition of an arithmetic property of a stament, and a logical description of a logical property of a statement – all at once!
Using that encoding, then – which can be formed for any logic that can express Peano arithmetic – he showed that you could form a self-referential statement: a number that was a statement about numbers including the number that was statement itself. And more, it could encode a meta-property of the statement in a way that was both true, and also unprovable: he showed how to create a logical property “There is no proof of this statement”, which applied to its own numeric encoding. So the statement said, about itself, that it couldn’t be proven.

The existence of that statement meant that the Principia – and any similar system! – was incomplete. Completeness means that every true statement is provably true within the system. But the statement encodes the fact that it cannot be proven. If you could prove it, the system would be inconsistent. If you can’t, it’s consistent, but incomplete.

We’re going to go through all of that in detail. But to do that in a way where we can really understand it, we’re going to need to start by looking at the logic used by the Principia. Then we’ll look at how to encode Principia-logical statements as natural numbers using a technique called Gödel numbering, and how to express predicates numerically. And then, finally, we can look at how you can form the classic Gödel statement that shows that the Principia’s system of logic is incomplete.

Programs As Proofs: Models and Types in the Lambda Calculus

14 Replies

Lambda calculus started off with the simple, untyped lambda calculus that we’ve been talking about so far. But one of the great open questions about lambda calculus was: was it sound? Did it have a valid model?

Church found that it was easy to produce some strange and non-sensical expressions using the simple lambda calculus. In order to try to work around those problems, and end up with a consistent system, Church introduced the concept of types, producing the simply typed lambda calculus. Once types hit the scene, things really went wild; the type systems for lambda calculi have never stopped developing: people are still finding new things to do by extending the LC type system today! Most lambda calculus based programming languages are based on the Hindley-Milner lambda calculus, which is a simplification of one of the standard sophisticated typed lambda calculi called SystemF. There’s even a Lambda Cube which can categorize the different type abstractions for lambda calculus (but alas, despite its name, it’s not related to the time cube.) Once people really started to understand types, they realized that the untyped lambda calculus was really just a pathologically simple instance of the simply typed lambda calculus: a typed LC with only one base type.

The semantics of lambda calculus are easiest to talk about in a typed version. For now, I’ll talk about the simplest typed LC, known as the simply typed lambda calculus. One of the really amazing things about this, which I’ll show, is that a simply typed lambda calculus is completely semantically equivalent to an intuitionistic propositional logic: each type in the program is a proposition in the logic; each β reduction corresponds to an inference step; and each complete function corresponds to a proof! Look below for how.

Types

The main thing that typed lambda calculus adds to the mix is a concept called base types. In a typed lambda calculus, you have some universe of atomic values which you can manipulate; those values are partitioned into the *base types*. Base types are usually named by single lower-case greek letters: So, for example, we could have a type “σ”, which consists of the set of natural numbers; a type “τ” which corresponds to boolean true/false values; and a type “γ” which corresponds to strings.

Once we have basic types, then we can talk about the type of a function. A function maps from a value of one type (the type of parameter) to a value of a second type (the type of the return value). For a function that takes a parameter of type “γ”, and returns a value of type “δ”, we write its type as “γ → δ”. “→” is called the _function type constructor_; it associates to the right, so “γ → δ → ε” parses as “γ → (δ → ε)”

To apply types to the lambda calculus, we do a couple of things. First, we need a syntax update so that we can include type information in lambda terms. And second, we need to add a set of rules to show what it means for a typed program to be valid.

The syntax part is easy. We add a “:” to the notation; the colon has an expression or variable binding on its left, and a type specification on its right. It asserts that whatever is on the left side of the colon has the type specified on the right side. A few examples:

$\lambda x: \nu . x + 3$: This asserts that the parameter, $x$ has type $nu$ , which we’ll use as the type name for the natural numbers. (In case it’s hard to tell, that’s a greek letter “nu” for natural.) There is no assertion of the type of the result of the function; but since we know that “+” is a function with type $\nu \rightarrow \nu \rightarrow \nu$ , we can infer that the result type of this function will be $\nu$ .
$(\lambda x . x + 3): \nu \rightarrow \nu$: This is the same as the previous, but with the type declaration moved out, so that it asserts the type for the lambda expression as a whole. This time we can infer that $x : \nu$ because the function type is $\nu \rightarrow \nu$ , which means that the function parameter has type $\nu$ .
$\lambda x: \nu, y:\delta . \text{if}\hspace{1ex} y\hspace{1ex} \text{then}\hspace{1ex} x * x \hspace{1ex}\text{else} \hspace{1ex} x$: This is a two parameter function; the first parameter has type ν, and the second has type δ. We can infer the return type, which is ν. So the type of the full function is ν → δ → ν. This may seem surprising at first; but remember that lambda calculus really works in terms of single parameter functions; multi-parameter functions are a shorthand for currying. So really, this function is: λ x : ν . (λ y : δ . if y then x * x else x); the inner lambda is type δ → ν; the outer lambda is type ν → (δ → ν).

To talk about whether a program is valid with respect to types (aka well-typed), we need to introduce a set of rules for type inference. Then we can verify that the program is type-consistent.

In type inference, we talked about judgements. When we can infer the type of an expression using an inference rule, we call that inference a type judgement. Type inference and judgements allow us to reason about types in a lambda expression; and if any part of an expression winds up with an inconsistent type judgement, then the expression is invalid. (When Church started doing typed LC, one of the motivations was to distinguish between values representing “atoms”, and values representing “predicates”; he was trying to avoid the Godel-esque paradoxes, by using types to ensure that predicates couldn’t operate on predicates.)

Type judgements are usually written in a sequent-based notation, which looks like a fraction where the numerator consists of statements that we know to be true; and the denominator is what we can infer from the numerator. In the numerator, we normally have statements using a context, which is a set of type judgements that we already know;it’s usually written as an uppercase greek letter. If a type context includes the judgement that $x : \alpha$ , I’ll write that as $\Gamma :- x : \alpha$ .

Rule 1: Type Identity

$\frac{\mbox{}}{x : \alpha :- x: \alpha}$

This is the simplest rule: if we have no other information except a declaration of the type of a variable, then we know that that variable has the type it was declared with.

Rule 2: Type Invariance

$\frac{ \Gamma :- x:\alpha, x != y}{\Gamma + y:\beta :- x:\alpha}$

This rule is a statement of non-interference. If we know that $x:\alpha$ , then inferring a type judgement about any other term cannot change our type judgement for $x$ .

Rule 3: Function Type Inference

$\frac{\Gamma + x:\alpha :- y:\beta}{\Gamma :- (\lambda x:\alpha. y):\alpha \rightarrow \beta}$

This statement allows us to infer function types given parameter types. Ff we know the type of the parameter to a function is $\alpha$ ; and if, with our knowledge of the parameter type, we know that the type of term that makes up the body of the function is $\beta$ , then we know that the type of the function is $\alpha \rightarrow \beta$ .

Rule 4: Function Application Inference

$\frac{\Gamma :- x: \alpha \rightarrow \beta, \Gamma :- y:\alpha}{\Gamma :- (x y): \beta}$

This one is easy: if we know that we have a function that takes a parameter of type $\alpha$ and returns a value of type $\beta$ , then if we apply that function to a value of type $\alpha$ , we’ll get a value of type $\beta$ .

These four rules are it. If we can take a lambda expression, and come up with a consistent set of type judgements for every term in the expression, then the expression is well-typed. If not, then the expression is invalid.

So let’s try taking a look at a simple lambda calculus expression, and seeing how inference works on it.

$\lambda x y . y x$

Without any type declarations or parameters, we don’t know the exact type. But we do know that “x” has some type; we’ll call that “α”; and we know that “y” is a function that will be applied with “x” as a parameter, so it’s got parameter type α, but its result type is unknown. So using type variables, we can say “x:α,y:α→β”. We can figure out what “α” and “β” are by looking at a complete expression. So, let’s work out the typing of it with x=”3″, and y=”λ a:ν.a*a”. We’ll assume that our type context already includes “*” as a function of type “ν→ν→ν”, where ν is the type of natural numbers.

“λ x y . y x) 3 (λ a:ν . a * a)”: Since 3 is a literal integer, we know its type: 3:ν.
By rule 4, we can infer that the type of the expression “a*a” where “a:ν” is “ν”, and *:ν→ν→ν so therefore, by rule 3 the lambda expression has type “ν→ν”. So with type labelling, our expression is now: “(λ x y . y x) (3:ν) (λ a:ν.(a*a):ν) : ν→ν”.
So – now, we know that the parameter “x” of the first lambda must be “ν”; and “y” must be “ν→ν”; so by rule 4, we know that the type of the application expression “y x” must be “ν”; and then by rule 3, the lambda has type: “ν→(ν→ν)→ν”.
So, for this one, both α and β end up being “ν”, the type of natural numbers.

So, now we have a simply typed lambda calculus. The reason that it’s simply typed is because the type treatment here is minimal: the only way of building new types is through the unavoidable $\rightarrow$ constructor. Other typed lambda calculi include the ability to define parametric types, which are types expressed as functions ranging over types.

Programs are Proofs

Here’s where it gets really fun. Think about the types in the simple typed language calculus. Anything which can be formed from the following grammar is a lambda calculus type:

type ::= primitive | function | ( type )
primitive ::= α | β | δ | …
function ::= type→type

The catch with that grammar is that you can create type expressions which, while they are valid type definitions, you can’t write a single, complete, closed expression which will actually have that type. (A closed expression is one with no free variables.) When there is an expression that has a type, we say that the expression inhabits the type; and that the type is an inhabited type. If there is no expression that can inhabit a type, we say it’s uninhabitable. Any expression which either can’t be typed using the inference rules, or which is typed to an uninhabitable type is a type error.

So what’s the difference between inhabitable type, and an uninhabitable type?

The answer comes from something called the Curry-Howard isomorphism. For a typed lambda calculus, there is a corresponding intuitionistic logic. A type expression is inhabitable if and only if the type is a provable theorem in the corresponding logic.

The type inference rules in lambda calculus are, in fact, the same as logical inference rules in intuitionistic logic. A type $\alpha \rightarrow \beta$ can be seen as either a statement that this is a function that maps from a value of type $\alpha$ to a value of type $\beta$ , or as a logical statement that if we’re given a fact alpha $\alpha$ , we could use that to infer the truth of a fact $\beta$ .

If there’s a logical inference chain from an axiom (a given type assignment) to an inferred statement, then the inferred statement is an inhabitable type. If we have a type $\alpha \rightarrow \alpha$ , then given a inhabited type $\alpha$ , we know that $\alpha \rightarrow \alpha$ is inhabitable, because if $\alpha$ is a fact, then $\alpha \rightarrow \alpha$ is also a fact.

On the other hand, think of a different case $\alpha \rightarrow \beta$ . That’s not a theorem, unless there’s some other context that proves it. As a function type, that’s the type of a function which, without including any context of any kind, can take a parameter of type α, and return a value of a different type β. You can’t do that – there’s got to be some context which provides a value of type β – and to access the context, there’s got to be something to allow the function to access its context: a free variable. Same thing in the logic and the lambda calculus: you need some kind of context to establish “α→β” as a theorem (in the logic) or as an inhabitable type (in the lambda calculus).

It gets better. If there is a function whose type is a theorem in the corresponding intuitionistic logic, then the program that has that type is a proof of the theorem. Each beta reduction is equivalent to an inference step in the logic. This is what programming languages geeks like me mean when we say “the program is the proof”: a well-typed program is, literally, a proof its well-typed-ness.

To connect back to the discussion about models: the intuitionistic logic corresponding to the lambda calculus and intuitionistic logic are, in a deep sense, just different reflections of the same thing. We know that intuitionistic logic has a valid model. And that, in turn, means that lambda calculus is valid as well. When we show that something is true using the lambda calculus, we can trust that it’s not an artifact of an inconsistent system.

Models and Why They Matter

10 Replies

As I said in the last post, Church came up with λ-calculus, which looks like it’s a great formal model of computation. But – there was a problem. Church struggled to find a model. What’s a model, and why would that matter? That’s the point of this post. To get a quick sense of what a model is, and why it matters?

A model is basically a mapping from the symbols of a logical system to some set off objects, such that all statements that you can prove in the logical system will be true about the corresponding objects. Note that when I say object here, I don’t necessarily mean real-world physical objects – they’re just something that we can work with, which is well-defined and consistent.

Why does it matter? Because the whole point of a system like λ-calculus is because we want to use it for reasoning. When you have a logical system like λ-calculus, you’ve built this system with its rules for a reason – because you want to use it as a tool for understanding something. The model provides you with a way of saying that the conclusions you derive using the system are meaningful. If the model isn’t correct, if it contains any kind of inconsistency, then your system is completely meaningless: it can be used to derive anything.

So the search for a model for λ-calculus is really important. If there’s a valid model for it, then it’s wonderful. If there isn’t, then we’re just wasting our time looking for one.

So, now, let’s take a quick look at a simple model, to see how a problem can creep in. I’m going to build a logic for talking about the natural numbers – that is, integers greater than or equal to zero. Then I’ll show you how invalid results can be inferred using it; and finally show you how it fails by using the model.

One quick thing, to make the notation easier to read: I’m going to use a simple notion of types. A type is a set of atoms for which some particular one-parameter predicate is true. For example, if $P(x)$ is true, I’ll say that x is a member of type P. In a quantifier, I’ll say things like $forall x in P: mbox{em foo}$ to mean $forall x : P(x) Rightarrow mbox{em foo}$ . Used this way, we can say that P is a type predicate.

How do we define natural numbers using logic?

First, we need an infinite set of atoms, each of which represents one number. We pick one of them, and call it zero. To represent the fact that they’re natural numbers, we define a predicate ${cal N}(x)$ , which is true if and only if x is one of the atoms that represents a natural number.

Now, we need to start using predicates to define the fundamental properties of numbers. The most important property of natural numbers is that they are a sequence. We define that idea using a predicate, $mbox{em succ}(x,y)$ , where $mbox{em succ}(x,y)$ is true if and only if x = y + 1. To use that to define the ordering of the naturals, we can say: $forall x in {cal N}: exists y: mbox{em succ}(x, y)$ .

Or in english: every natural number has a successor – you can always add one to a natural number and get another natural number.

We can also define predecessor similarly, with two statements:

$forall x in {cal N}: exists y in {cal N}: mbox{em pred}(x, y)$ .
$forall x,y in {cal N}: mbox{em pred}(y,x) Leftrightarrow mbox{em succ}(x,y)$

So every number has a predecessor, and every number has a successor, and x is the predecessor of y if y is the successor of x.

To be able to define things like addition and subtraction, we can use successor. Let’s define addition using a predicate Sum(x,y,z) which means “z = x + y”.

$forrall x,y in {cal N}: exists z in {cal N} : Sum(x,y,z)$
$forall x,y in {cal N}: Sum(x, 0, x)$
$forall x,y,z in {cal N}: exists a,b in {cal N}: Sum(a,b,z) land mbox{em succ}(a,x) land mbox{em succ}(y,b) Rightarrow Sum(x, y, z)$

Again, in english: for any two natural numbers, there is a natural number that it their sum; x + 0 always = x; and for any natural number, x + y = z is true if (x + 1) + (y – 1) = z.

Once we have addition, subtraction is easy: $forall x,y,z in {cal N} : diff(x,y,z) Leftrightarrow sum(z,y,x)$

That’s: x-y=z if and only if x=y+z.

We can also define greater than using addition:

$forall x,y in {cal N} : x > y Leftrightarrow$ y Leftrightarrow’ style=’vertical-align:1%’ class=’tex’ alt=’forall x,y in {cal N} : x > y Leftrightarrow’ />

$mbox{em succ}(x,y)) lor$
$exists z in {cal N}: mbox{em succ}(x, z)) land exists z in {cal N}: mbox{em succ}(x,z) and z > y$ y’ style=’vertical-align:1%’ class=’tex’ alt=’exists z in {cal N}: mbox{em succ}(x, z)) land exists z in {cal N}: mbox{em succ}(x,z) and z > y’ />.

That’s x > y if you can get to x from y by adding one repeatedly.

So, we’ve got a nice definition of natural numbers here, right?

Almost. There’s one teeny little mistake.

We said that every natural number has a successor and a predecessor, and we also said that a number x is greater than a number y if you can get from y to x using a sequence of successors. That means that 0 has a predecessor, and that the predecessor of 0 is less than 0. But we’re supposed to be defining the natural numbers! And one of the standard axioms of the natural numbers is that $forall x in {cal N}: 0 le x$ . But we’ve violated that – we have both $forall x in {cal N}: 0 le x$ , and
$exists x in {cal N}: 0 > x$ x’ style=’vertical-align:1%’ class=’tex’ alt=’exists x in {cal N}: 0 > x’ />. And with a contradiction like that in the system, we can prove anything we want, anything at all. We’ve got a totally worthless, meaningless system.

That’s why mathematicians are so particular about proving the validity of their models: because the tiniest error can mean that anything you proved with the logic might not be true – your proofs are worthless.

3-Valued Semantics

9 Replies

Before we can move from three-valued logic to fuzzy logic, we need to take a look at semantics – both how conventional two-valued logic handle semantics, and how three-valued logics extend the basic semantic structure. This isn’t exactly one of the more exciting topics I’ve ever written about – but it is important, and going through it now will set the groundwork for the interesting stuff – the semantics of a true fuzzy logic.

What we’ve looked at so far has been propositional 3-valued logics. Propositional logics aren’t particularly interesting. You can’t do or say much with them. What we really care about is predicate logics. But all we need to do is take the three-valued logics we’ve seen, and allow statements to be predicate(object).

In a conventional first-order predicate logic, we define the semantics in terms of a model or interpretation of the logic. (Technically, a logic and an interpretation aren’t quite the same thing, but for our purposes here, we don’t need to get into the difference.)

An interpretation basically takes a domain consisting of a set of objects or values, and does two things:

For each atomic symbol in the logic, it assigns an object from the domain. That value is called the interpretation of the symbol.
For each predicate in the logic, it assigns a set, called the extension of the predicate. The extension contains the tuples for which the predicate is true.

For example, we could use logic to talk about Scientopia. The domain would be the set of bloggers, and the set of blogs. Then we could have predicates like “Writes”, which takes two parameters – A, and B – and which is true is A is the author of the blog B.

Then the extension of “Writes” would be a set of pairs: { (MarkCC, Good Math/Bad Math), (Scicurious, Neurotic Physiology), … }.

We can also define the counterextension, which is the set of pairs for whiche the predicate is not true. So the counterextension of “writes” would contain values like { (MarkCC, Neurotic Physiology), …}.

Given a domain of objects, and the extension of each predicate, we know the meaning of statements in the logic. We know what objects we’re reasoning about, and we know the truth or falsehood of every statement. Importantly, we don’t need to know the counterextension of a predicate: if we know the extension, then the counterextension is simple the complement of the extension.

In three-valued Lukasiewicz logic, that’s not true, for reasons that should be obvious: if $I(A)$ is the interpretation of the predicate $A$ , then the complement of $I(A)$ is not the same thing as $I(lnot A)$ . $L_3$ requires three sets for a predicate: the extension, the counterextension, and the fringe. The fringe of a predicate $P$ is the set of values $x$ for which $P(x)$ is $N$ .

To be more precise, an interpretation $I$ for first order $L_3$ consists of:

A set of values, $D$ , called the domain of the logic. This is the set of objects that the logic can be used to reason about.
For each predicate of arity (that is, taking arguments), three sets , , and , such that:
- the values of the members of all three sets are members of $D^n$ .
- the sets are mutually exclusive – that is, there is no value that is in more than one of the sets.
- the sets are exhaustive: $ext(P) cup cext(P) cup fringe(P) = D^n$ .
For each constant symbol $a$ in the logic, an assignment of $a$ to some member of $D$ : $I(a) in D$

With the interpretation, we can look at statements in the logic, and determine their truth or falsehood. But when we go through a proof, we’ll often have statements that don’t operate on specific values – they use variables inside of them. In order to make a statement have a truth value, all of the variables in that statement have to be bound by a quantifier, or assigned to a specific value by a variable assignment. So given a statement, we can frequently only talk about its meaning in terms of variable assignments for it.

So, for example, consider a simple statement: P(x,y,z). In the interpretation I, P(x, y, z) is satisfied if $(I(x), I(y), I(z)) in ext(P)$ ; it’s dissatisfied if $(I(x), I(y), I(z)) in cext(P)$ . Otherwise, $(I(x), I(y), I(z))$ must be in $fringe(P)$ , and then the statement is undetermined.

The basic connectives – and, or, not, implies, etc., all have defining rules like the above – they’re obvious and easy to derive given the truth tables for the connectives, so I’m not going to go into detail. But it does get at least a little bit interesting when we get to quantified statements. But to talk about we need to first define a construction called a variant. Given a statement with variable assignment v, which maps all of the variables in the statement to values, an x-variant $v'$ of $v$ is a variable assignment where for every variable $y$ except $x$ , $v'(y) == v(y)$ . In other words, it’s an assignment where all of the variables except x have the same value as in $v$ .

Now we can finally get to the interpretation of quantified statements. Given a statement $forall x P$ , $P$ is satisfied by a variable assignment $v$ if $P$ is satisfied by every x-variant of $v$ ; it’s dissatisfied if $P$ is dissatisfied by at least one x-variant of $v$ . Otherwise, it’s undetermined.

Similarly, an existentially quantified statement $exists x P$ is satisfied by $v$ if $P$ is satisfied by at least one x-variant of $v$ ; it’s dissatisfied if $P$ is dissatisfied by every x-variant of v. Otherwise, it’s undetermined.

Finally, now, we can get to the most important bit: what it means for a statement to be true or false in $L_3$ . A statement $S$ is $T$ (true) if it is satisfied by every variable assignment on $I$ ; it’s $F$ (false) if it’s dissatisfied by every variable assignment on $I$ , and it’s $N$ otherwise.

Fuzzy Logic vs Probability

30 Replies

In the comments on my last post, a few people asked me to explain the difference between fuzzy logic and probability theory. It’s a very good question.

The two are very closely related. As we’ll see when we start looking at fuzzy logic, the basic connectives in fuzzy logic are defined in almost the same way as the corresponding operations in probability theory.

The key difference is meaning.

There are two major schools of thought in probability theory, and they each assign a very different meaning to probability. I’m going to vastly oversimplify, but the two schools are the frequentists and the Bayesians

First, there are the frequentists. To the frequentists, probability is defined by experiment. If you say that an event E has a probability of, say, 60%, what that means to the frequentists is that if you could repeat an experiment observing the occurrence or non-occurrence of E an infinite number of times, then 60% of the time, E would have occurred. That, in turn, is taken to mean that the event E has an intrinsic probability of 60%.

The other alternative are the Bayesians. To a Bayesian, the idea of an event having an intrinsic probability is ridiculous. You’re interested in a specific occurrence of the event – and it will either occur, or it will not. So there’s a flu going around; either I’ll catch it, or I won’t. Ultimately, there’s no probability about it: it’s either yes or no – I’ll catch it or I won’t. Bayesians say that probability is an assessment of our state of knowledge. To say that I have a 60% chance of catching the flu is just a way of saying that given the current state of our knowledge, I can say with 60% certainty that I will catch it.

In either case, we’re ultimately talking about events, not facts. And those events will either occur, or not occur. There is nothing fuzzy about it. We can talk about the probability of my catching the flu, and depending on whether we pick a frequentist or Bayesian interpretation, that means something different – but in either case, the ultimate truth is not fuzzy.

In fuzzy logic, we’re trying to capture the essential property of vagueness. If I say that a person whose height is 2.5 meters is tall, that’s a true statement. If I say that another person whose height is only 2 meters is tall, that’s still true – but it’s not as true as it was for the person 2.5 meters tall. I’m not saying that in a repeatable experiment, the first person would be tall more often than the second. And I’m not saying that given the current state of my knowledge, it’s more likely than the first person is tall than the second. I’m saying that both people possess the property tall – but in different degrees.

Fuzzy logic is using pretty much the same tools as probability theory. But it’s using them to trying to capture a very different idea. Fuzzy logic is all about degrees of truth – about fuzziness and partial or relative truths. Probability theory is interested in trying to make predictions about events from a state of partial knowledge. (In frequentist terms, it’s about saying that I know that if I repeated this 100 times, E would happen in 60; in Bayesian, it’s precisely a statement of partial knowledge: I’m 60% certain that E will happen.) But probability theory says nothing about how to reason about things that aren’t entirely true or false.

And, in the other direction: fuzzy logic isn’t particularly useful for talking about partial knowledge. If you allowed second-order logic, you could have fuzzy meta-predicates that described your certainty about crisp first-order predicates. But with first order logic (which is really where we want to focus our attention), fuzzy logic isn’t useful for the tasks where we use probability theory.

So probability theory doesn’t capture the essential property of meaning (partial truth) which is the goal of fuzzy logic – and fuzzy logic doesn’t capture the essential property of meaning (partial knowledge) which is the goal of probability theory.

More 3-valued logic: Lukasiewicz and Bochvar

13 Replies

Last time I wrote about fuzzy logic, we were looking at 3-valued logics, and I mentioned that there’s more than one version of 3-valued logic. We looked at one, called $K^S_3$ , Kleene’s strong 3-valued logic. In $K^S_3$ , we extended a standard logic so that for any statement, you can say that it’s true (T), false (F), or that you don’t know (N). In this kind of logic, you can see some of the effect of uncertainty. In many ways, it’s a very natural logic for dealing with uncertainty: “don’t know” behaves in a very reasonable way.

For example, suppose I know that Joe is happy, but I don’t know if Jane is happy. So the truth value of “Happy(Joe)” is T; the truth value of “Happy(Jane)” is N. In Kleene, the truth value of “Happy(Joe) ∨ Happy(Jane)” is T; since “Happy(Joe)” is true, then “Happy(Joe) ∨ anything” is true. And “Happy(Joe) ∧ Happy(Jane)” is N; since we know that Joe is happy, but we don’t know whether or not Jane is happy, we can’t know whether both Joe and Jane are happy. It works nicely. It’s a rather vague way of handling vagueness, (that is, it lets you say you’re not sure, but it doesn’t let you say how not sure you are) but in so far as it goes, it works nicely.

A lot of people, when they first see Kleene’s three-valued logic think that it makes so much sense that it somehow defines the fundamental, canonical three-valued logic in the same way that, say, first order predicatelogic defines the fundamental two-valued predicate logic.

It isn’t.

There are a bunch of different ways of doing three-valued logic. The difference between them is related to the meaning of the third value – which, in turn, defines how the various connectives work.

There are other 3-valued logics. We’ll talk about two others. There’s Bochvar’s logic, and there’s Lukasiewicz’s. In fact, we’ll end up building our fuzzy logic on Lukasiewicz’s. But Bochvar is interesting in its own right. So we’ll take a look at both.

Continue reading →

Introducing Three-Valued Logic

52 Replies

So, finally, I’m getting around to three-valued logic!

To start, why are we even looking at three-valued logic? We want to get to fuzzy reasoning, and three valued logic really doesn’t do the job. But it’s a useful starting point, because in general, most of us have only worked with standard logics where statements are either true or false. In Fuzzy logic, we’re don’t have that – and that can lead to a lot of complexity. So looking at something simple that shows us how something either than “true” or “false” can actually make sense in logic.

Continue reading →

Leading in to Fuzzy Logic

94 Replies

So, as I said in the intro to the new Scientopical Good Math/Bad Math, I’m
going to writing some articles about fuzzy logic.

Before diving into the fuzzy stuff, let’s start at the beginning. What
is fuzzy logic? More fundamentally, what is logic? And what problem does fuzziness solve?

So, first, what’s logic?

Logic isn’t really a single thing. There are many different logics, for different purposes. But at the core, all logics have the same basic underlying concept: a logic is a symbolic reasoning system. A logic is a system for expressing statements in a form where you reason about them in a completely mechanical way, without even knowing what they mean. If you know that some collection of statements is true, using the logic, you can infer – meaning derive from a sequence of purely mechanical steps – whether or not other statements are true. If the logic is formulated properly, and you started with an initial set of true statements, then any conclusions drawn using the inference process from those statements will also be true.

When we say logic, we tend to automatically think of a particular logic: the first order predicate logic, which is the most common, fundamental logic used in both mathematics, and in rhetoric and debate. But there are an astonishing number of different logics for different purposes. Fuzzy logic is one particular
variation, which tries to provide a way of reasoning about vagueness.

Continue reading →

A Quick Bit of Temporal Logic: Introducing CTL

Mr. Spock is Not Logical (book draft excerpt)

Good Math/Bad Math

The beauty of math; the humor of stupidity.

Category Archives: Logic

Gödel's Incompleteness

Like this:

Programs As Proofs: Models and Types in the Lambda Calculus

Types

Programs are Proofs

Like this:

Models and Why They Matter

Like this:

3-Valued Semantics

Like this:

Fuzzy Logic vs Probability

Like this:

More 3-valued logic: Lukasiewicz and Bochvar

Like this:

Introducing Three-Valued Logic

Like this:

Leading in to Fuzzy Logic

Like this:

A Quick Bit of Temporal Logic: Introducing CTL

Like this:

Mr. Spock is Not Logical (book draft excerpt)

Like this:

Share this:

Like this:

Types

Programs are Proofs

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this:

Share this:

Like this: