Simply Logical

Peter Flach | http://www.cs.bris.ac.uk/~flach/SimplyLogical.html

Interactive lab examples

Simply Logical

Chapter 1: A brief introduction to clausal logic

In this chapter, we will introduce clausal logic as a formalism for representing and reasoning with knowledge. The aim of this chapter is to acquaint the reader with the most important concepts, without going into too much detail. The theoretical aspects of clausal logic, and the practical aspects of Logic Programming, will be discussed in Chapters 2 and 3.

The London Underground example

Our Universe of Discourse in this chapter will be the London Underground, of which a small part is shown in fig. 1.1. We will try to capture this information in logical statements, describing which stations are directly connected by which lines.

The London Underground example (ctd.)

Figure 1.1.

Part of the London Underground. Reproduced by permission of London Regional Transport (LRT Registered User No. 94/1954).

Zoom in on figures (or other display elements) by right-clicking or alt-clicking.

The London Underground example (ctd.)

If we follow the lines from left to right (Northern downwards), we come up with the following 11 formulas:

connected(bond_street,oxford_circus,central).
connected(oxford_circus,tottenham_court_road,central).
connected(bond_street,green_park,jubilee).
connected(green_park,charing_cross,jubilee).
connected(green_park,piccadilly_circus,piccadilly).
connected(piccadilly_circus,leicester_square,piccadilly).
connected(green_park,oxford_circus,victoria).
connected(oxford_circus,piccadilly_circus,bakerloo).
connected(piccadilly_circus,charing_cross,bakerloo).
connected(tottenham_court_road,leicester_square,northern).
connected(leicester_square,charing_cross,northern).

Clicking the cogwheel at the top right of the program box opens an interactive SWISH window, in which you can run queries against the given program. Press Run! to use the given query, look under Examples to see other suggested queries, and be sure to look beyond the first answer!

The London Underground example (ctd.)

Let’s define two stations to be nearby if they are on the same line, with at most one station in between. This relation can also be represented by a set of logical formulas:

nearby(bond_street,oxford_circus).
nearby(oxford_circus,tottenham_court_road).
nearby(bond_street,tottenham_court_road).
nearby(bond_street,green_park).
nearby(green_park,charing_cross).
nearby(bond_street,charing_cross).
nearby(green_park,piccadilly_circus).
nearby(piccadilly_circus,leicester_square).
nearby(green_park,leicester_square).
nearby(green_park,oxford_circus).
nearby(oxford_circus,piccadilly_circus).
nearby(piccadilly_circus,charing_cross).
nearby(oxford_circus,charing_cross).
nearby(tottenham_court_road,leicester_square).
nearby(leicester_square,charing_cross).
nearby(tottenham_court_road,charing_cross).

The London Underground example (ctd.)

These 16 formulas have been derived from the previous 11 formulas in a systematic way. If X and Y are directly connected via some line L, then X and Y are nearby. Alternatively, if there is some Z in between, such that X and Z are directly connected via L, and Z and Y are also directly connected via L, then X and Y are also nearby. We can formulate this in logic as follows:

nearby(X,Y):-connected(X,Y,L).
nearby(X,Y):-connected(X,Z,L),connected(Z,Y,L).

The London Underground example (ctd.)

In these formulas, the symbol ‘ :- ’ should be read as ‘if’, and the comma between connected(X,Z,L) and connected(Z,Y,L) should be read as ‘and’. The uppercase letters stand for universally quantified variables, such that, for instance, the second formula means:

For any values of X, Y, Z and L, X is nearby Y if X is directly connected to Z via L, and Z is directly connected to Y via L.

The London Underground example (ctd.)

It is sometimes convenient to treat variables not occurring to the left of the if-symbol ‘ :- ’ as existentially quantified within the if-part, leading to the following (equivalent) reading:

For any values of X and Y, X is nearby Y if there exist values of Z and L such that X is directly connected to Z via L, and Z is directly connected to Y via L.

Facts and rules

We now have two definitions of the nearby-relation, one which simply lists all pairs of stations that are nearby each other, and one in terms of direct connections. Logical formulas of the first type, such as

nearby(bond_street,oxford_circus).

will be called facts, and formulas of the second type, such as

nearby(X,Y):-connected(X,Z,L),connected(Z,Y,L).

will be called rules.

Facts and rules (ctd.)

Facts express unconditional truths, while rules denote conditional truths, i.e. conclusions which can only be drawn when the premises are known to be true. Rules are also called clauses, and the premises of a clause (the bit following the if-symbol ‘ :- ’) are also called the body of the clause, while the conclusion is called its head.

Obviously, we want these two definitions to be equivalent: for each possible query, both definitions should give exactly the same answer. We will make this more precise in the next section.

Facts and rules (ctd.)

Exercise 1.1.

Two stations are ‘not too far’ if they are on the same or a different line, with at most one station in between. Define rules for the predicate not_too_far.

not_too_far(X,Y):-true. % replace 'true' with your definition
not_too_far(X,Y):-true. % add more clauses as needed

1.1 Answering queries

A query like ‘which station is nearby Tottenham Court Road?’ will be written as

?-nearby(tottenham_court_road,W).

where the prefix ‘ ?- ’ indicates that this is a query rather than a fact. An answer to this query, e.g. ‘Leicester Square’, will be written { W → leicester_square }, indicating a substitution of values for variables, such that the statement in the query, i.e.

?-nearby(tottenham_court_road,leicester_square).

is true.

Now, if the nearby-relation is defined by means of a list of facts, answers to queries are easily found: just look for a fact that matches the query, by which is meant that the fact and the query can be made identical by substituting values for variables in the query. Once we have found such a fact, we also have the substitution which constitutes the answer to the query.

Using rules to answer queries

If rules are involved, query-answering can take several of these steps. For answering the query ?-nearby(tottenham_court_road,W), we match it with the conclusion of the rule

nearby(X,Y):-connected(X,Y,L)

yielding the substitution { X → tottenham_court_road, Y → W }. We then try to find an answer for the premises of the rule under this substitution, i.e. we try to answer the query

?-connected(tottenham_court_road,W,L).

Using rules to answer queries (ctd.)

That is, we can find a station nearby Tottenham Court Road, if we can find a station directly connected to it. This second query is answered by looking at the facts for direct connections, giving the answer { W → leicester_square, L → northern }. Finally, since the variable L does not occur in the initial query, we just ignore it in the final answer, which becomes { W → leicester_square } as above.

In fig. 1.2, we give a graphical representation of this process. Since we are essentially proving that a statement follows logically from some other statements, this graphical representation is called a proof tree.

Using rules to answer queries (ctd.)

Figure 1.2.

A proof tree for the query ?-nearby(tottenham_court_road,W).

Resolution

The steps in fig. 1.2 follow a very general reasoning pattern:

to answer a query ?- $Q_1, Q_2, \ldots, Q_n$, find a rule $A$ :- $B_1, \ldots, B_m$ such that $A$ matches with $Q_1$, and answer the query ?- $B_1, \ldots, B_m, Q_2, \ldots, Q_n$.

This reasoning pattern is called resolution, and we will study it extensively in Chapters 2 and 3. Resolution adds a procedural interpretation to logical formulas, besides their declarative interpretation (they can be either true or false). Due to this procedural interpretation, logic can be used as a programming language.

Proof by refutation

The resolution proof process makes use of a technique that is known as reduction to the absurd: suppose that the formula to be proved is false, and show that this leads to a contradiction, thereby demonstrating that the formula to be proved is in fact true. Such a proof is also called a proof by refutation.

For instance, if we want to know which stations are nearby Tottenham Court Road, we negate this statement, resulting in ‘there are no stations nearby Tottenham Court Road’.

Proof by refutation (ctd.)

In logic, this is achieved by writing the statement as a rule with an empty conclusion, i.e. a rule for which the truth of its premises would lead to falsity:

:-nearby(tottenham_court_road,W).

Thus, the symbols ‘ ?- ’ and ‘ :- ’ are in fact equivalent.

A contradiction is found if resolution leads to the empty rule, of which the premises are always true (since there are none), but the conclusion is always false. Conventionally, the empty rule is written as ‘ □ ’.

Success set

At the beginning of this section, we posed the question: can we show that our two definitions of the nearby-relation are equivalent? As indicated before, the idea is that to be equivalent means to provide exactly the same answers to the same queries. To formalise this, we need some additional definitions.

A ground fact is a fact without variables. Obviously, if G is a ground fact, the query ?-G never returns a substitution as answer: either it succeeds (G does follow from the initial assumptions), or it fails (G does not). The set of ground facts G for which the query ?-G succeeds is called the success set.

Success set (ctd.)

Thus, the success set for our first definition of the nearby-relation consists simply of those 16 formulas, since they are ground facts already, and nothing else is derivable from them. The success set for the second definition of the nearby-relation is constructed by applying the two rules to the ground facts for connectedness. Thus we can say: two definitions of a relation are (procedurally) equivalent if they have the same success set (restricted to that relation).

Success set (ctd.)

Exercise 1.2.

Construct the proof trees for the query

?-nearby(W,charing_cross).

1.2 Recursion

Until now, we have encountered two types of logical formulas: facts and rules. There is a special kind of rule which deserves special attention: the rule which defines a relation in terms of itself. This idea of ‘self-reference’, which is called recursion, is also present in most procedural programming languages.

Recursion is a bit difficult to grasp, but once you’ve mastered it, you can use it to write very elegant programs, e.g.

IF N=0
THEN FAC:=1
ELSE FAC:=N*FAC(N-1).

is a recursive procedure for calculating the factorial of a given number, written in a Pascal-like procedural language. However, in such languages iteration (looping a pre-specified number of times) is usually preferred over recursion, because it uses memory more efficiently.

Reachability as a recursive predicate

In Prolog, however, recursion is the only looping structure [1] . (This does not necessarily mean that Prolog is always less efficient than a procedural language, because there are ways to write recursive loops that are just as efficient as iterative loops, as we will see in section 3.6.)

Perhaps the easiest way to think about recursion is the following: an arbitrarily large chain is described by describing how one link in the chain is connected to the next.

Reachability as a recursive predicate (ctd.)

For instance, let us define the relation of reachability in our underground example, where a station is reachable from another station if they are connected by one or more lines. We could define it by the following 20 ground facts:

reachable(bond_street,charing_cross).
reachable(bond_street,green_park).
reachable(bond_street,leicester_square).
reachable(bond_street,oxford_circus).
reachable(bond_street,piccadilly_circus).
reachable(bond_street,tottenham_court_road).
reachable(green_park,charing_cross).
reachable(green_park,leicester_square).
reachable(green_park,oxford_circus).
reachable(green_park,piccadilly_circus).
reachable(green_park,tottenham_court_road).
reachable(leicester_square,charing_cross).
reachable(oxford_circus,charing_cross).
reachable(oxford_circus,leicester_square).
reachable(oxford_circus,piccadilly_circus).
reachable(oxford_circus,tottenham_court_road).
reachable(piccadilly_circus,charing_cross).
reachable(piccadilly_circus,leicester_square).
reachable(tottenham_court_road,charing_cross).
reachable(tottenham_court_road,leicester_square).

Reachability as a recursive predicate (ctd.)

Since any station is reachable from any other station by a route with at most two intermediate stations, we could instead use the following (non-recursive) definition:

reachable(X,Y):-connected(X,Y,L).
reachable(X,Y):-connected(X,Z,L1),connected(Z,Y,L2).
reachable(X,Y):-connected(X,Z1,L1),connected(Z1,Z2,L2),
                connected(Z2,Y,L3).

Reachability as a recursive predicate (ctd.)

Of course, if we were to define the reachability relation for the entire London underground, we would need a lot more, longer and longer rules. Recursion is a much more convenient and natural way to define such chains of arbitrary length:

reachable(X,Y):-connected(X,Y,L).
reachable(X,Y):-connected(X,Z,L),reachable(Z,Y).

The reading of the second rule is as follows: ‘ Y is reachable from X if Z is directly connected to X via line L, and Y is reachable from Z ’.

We can now use this recursive definition to prove that Leicester Square is reachable from Bond Street (fig. 1.3).

Reachability as a recursive predicate (ctd.)

Figure 1.3.

A proof tree for the query ?-reachable(bond_street,W).

Alternate proofs

However, just as there are several routes from Bond Street to Leicester Square, there are several alternative proofs of the fact that Leicester Square is reachable from Bond Street. An alternative proof is given in fig. 1.4.

Figure 1.4.

Alternative proof tree for the query ?-reachable(bond_street,W).

Alternate proofs (ctd.)

The difference between these two proofs is that in the first proof we use the fact

connected(oxford_circus,tottenham_court_road,central).

while in the second proof we use

connected(oxford_circus,piccadilly_circus,bakerloo).

There is no reason to prefer one over the other, but since Prolog searches the given formulas top-down, it will find the first proof before the second. Thus, the order of the clauses determines the order in which answers are found. As we will see in Chapter 3, it sometimes even determines whether any answers are found at all.

Alternate proofs (ctd.)

Exercise 1.3.

Give a third proof tree for the answer { W → leicester_square }, and change the order of the facts for connectedness, such that this proof tree is constructed first.

Search and backtracking

In other words, Prolog’s query-answering process is a search process, in which the answer depends on all the choices made earlier.

A important point is that some of these choices may lead to a dead-end later. For example, if the recursive formula for the reachability relation had been tried before the non-recursive one, the bottom part of fig. 1.3 would have been as in fig. 1.5. This proof tree cannot be completed, because there are no answers to the query ?-reachable(charing_cross,W), as can easily be checked. Prolog has to recover from this failure by climbing up the tree, reconsidering previous choices. This search process, which is called backtracking, will be detailed in Chapter 5.

Search and backtracking (ctd.)

Figure 1.5.

A failing proof tree.

1.3 Structured terms

Finally, we illustrate the way Prolog can handle more complex datastructures, such as a list of stations representing a route. Suppose we want to redefine the reachability relation, such that it also specifies the intermediate stations. We could adapt the non-recursive definition of reachable as follows:

reachable0(X,Y):-connected(X,Y,L).
reachable1(X,Y,Z):-connected(X,Z,L1),
                   connected(Z,Y,L2).
reachable2(X,Y,Z1,Z2):-connected(X,Z1,L1),
                       connected(Z1,Z2,L2),
                       connected(Z2,Y,L3).

The suffix of reachable indicates the number of intermediate stations; it is added to stress that relations with different number of arguments are really different relations, even if their names are the same. The problem now is that we have to know the number of intermediate stations in advance, before we can ask the right query. This is, of course, unacceptable.

Functors

We can solve this problem by means of functors. A functor looks just like a mathematical function, but the important difference is that functor expressions are never evaluated to determine a value. Instead, they provide a way to name a complex object composed of simpler objects.

For instance, a route with Oxford Circus and Tottenham Court Road as intermediate stations could be represented by

route(oxford_circus,tottenham_court_road)

Note that this is not a ground fact, but rather an argument for a logical formula.

Functors (ctd.)

The reachability relation can now be defined as follows:

reachable(X,Y,noroute):-connected(X,Y,L).
reachable(X,Y,route(Z)):-connected(X,Z,L1),
                         connected(Z,Y,L2).
reachable(X,Y,route(Z1,Z2)):-connected(X,Z1,L1),
                             connected(Z1,Z2,L2),
                             connected(Z2,Y,L3).

The query ?-reachable(oxford_circus,charing_cross,R) now has three possible answers:

{ R → route(piccadilly_circus) }
{ R → route(tottenham_court_road,leicester_square) }
{ R → route(piccadilly_circus,leicester_square) }

Functors, recursion, and trees

As argued in the previous section, we prefer the recursive definition of the reachability relation, in which case we use functors in a somewhat different way.

reachable(X,Y,noroute):-connected(X,Y,L).
reachable(X,Y,route(Z,R)):-connected(X,Z,L),
                           reachable(Z,Y,R).

Functors, recursion, and trees (ctd.)

At first sight, there does not seem to be a big difference between this and the use of functors in the non-recursive program. However, the query

?-reachable(oxford_circus,charing_cross,R).

now has the following answers:

{ R → route(tottenham_court_road,
            route(leicester_square,noroute)) }
{ R → route(piccadilly_circus,noroute) }
{ R → route(piccadilly_circus,
            route(leicester_square,noroute)) }

Functors, recursion, and trees (ctd.)

The functor route is now also recursive in nature: its first argument is a station, but its second argument is again a route. For instance, the object

route(tottenham_court_road,route(leicester_square,noroute))

can be pictured as in fig. 1.6.

Figure 1.6.

A complex object as a tree.

Functors, recursion, and trees (ctd.)

Such a figure is called a tree (we will have a lot more to say about trees in chapter 4). In order to find out the route represented by this complex object, we read the leaves of this tree from left to right, until we reach the ‘terminator’ noroute. This would result in a linear notation like

[tottenham_court_road,leicester_square]

Lists

For user-defined functors, such a linear notation is not available. However, Prolog provides a built-in ‘datatype’ called lists, for which both the tree-like notation and the linear notation may be used.

The functor for lists is . (dot), which takes two arguments: the first element of the list (which may be any object), and the rest of the list (which must be a list). The list terminator is the special symbol [], denoting the empty list.

Lists (ctd.)

For instance, the term

.(a,.(b,.(c,[])))

denotes the list consisting of a followed by b followed by c (fig. 1.7).

Figure 1.7.

The list [a,b,c] as a tree.

Lists (ctd.)

Alternatively, we may use the linear notation, which uses square brackets:

[a,b,c]

To increase readability of the tree-like notation, instead of

.(First,Rest)

one can also write

[First|Rest]

Lists (ctd.)

Note that Rest is a list: e.g., [a,b,c] is the same list as [a|[b,c]]. a is called the head of the list, and [b,c] is called its tail. Finally, to a certain extent the two notations can be mixed: at the head of the list, you can write any number of elements in linear notation. For instance,

[First,Second,Third|Rest]

denotes a list with three or more elements.

Lists (ctd.)

Exercise 1.4.

A list is either the empty list [], or a non-empty list [First|Rest] where Rest is a list. Define a relation list(L), which checks whether L is a list. Adapt it such that it succeeds only for lists of (i) even length and (ii) odd length.

Reachability with lists)

The recursive nature of such datastructures makes it possible to ignore the size of the objects, which is extremely useful in many situations. For instance, the definition of a route between two underground stations does not depend on the length of the route; all that matters is whether there is an intermediate station or not. For both cases, there is a clause.

Expressing the route as a list, we can state the final definition of the reachability relation:

reachable(X,Y,[]):-connected(X,Y,L).
reachable(X,Y,[Z|R]):-connected(X,Z,L),
                      reachable(Z,Y,R).

Reachability with lists) (ctd.)

The query ?-reachable(oxford_circus,charing_cross,R) now results in the following answers:

{ R → [tottenham_court_road,leicester_square] }
{ R → [piccadilly_circus] }
{ R → [piccadilly_circus, leicester_square] }

Note that Prolog writes out lists of fixed length in the linear notation.

Pattern matching with lists

Should we for some reason want to know from which station Charing Cross can be reached via a route with four intermediate stations, we should ask the query

?-reachable(X,charing_cross,[A,B,C,D])

which results in two answers:

{ X → bond_street , A → green_park , B → oxford_circus , 
  C → tottenham_court_road , D → leicester_square }
{ X → bond_street , A → green_park , B → oxford_circus , 
  C → piccadilly_circus , D → leicester_square }.

Pattern matching with lists (ctd.)

Exercise 1.5.

Construct a query asking for a route from Bond Street to Piccadilly Circus with at least two intermediate stations.

1.4 What else is there to know about clausal logic?

The main goal of this chapter has been to introduce the most important concepts in clausal logic, and how it can be used as a reasoning formalism. Needless to say, a subject like this needs a much more extensive and precise discussion than has been attempted here, and many important questions remain.

To name a few:

what are the limits of expressiveness of clausal logic, i.e. what can and what cannot be expressed?
what are the limits of reasoning with clausal logic, i.e. what can and what cannot be (efficiently) computed?
how are these two limits related: is it for instance possible to enhance reasoning by limiting expressiveness?

In order to start answering such questions, we need to be more precise in defining what clausal logic is, what expressions in clausal logic mean, and how we can reason with them. That means that we will have to introduce some theory in the next chapter. This theory will not only be useful for a better understanding of Logic Programming, but it will also be the foundation for most of the topics in Part III (Advanced reasoning techniques).

Another aim of Part I of this book is to teach the skill of programming in Prolog. For this, theory alone, however important, will not suffice. Like any programming language, Prolog has a number of built-in procedures and datastructures that you should know about. Furthermore, there are of course numerous programming techniques and tricks of the trade, with which the Prolog programmer should be familiar. These subjects will be discussed in Chapter 3. Together, Chapters 2 and 3 will provide a solid foundation for the rest of the book.

Want to practise more?

Go to https://swish.swi-prolog.org/example/movies.pl.