A distinct feature of logical reasoning is the separation between model theory and proof theory: a set of logical formulas determines the set of its models, but also the set of formulas that can be derived by applying inference rules. Another way to say the same thing is: logical formulas have both a declarative meaning and a procedural meaning. For instance, declaratively the order of the atoms in the body of a clause is irrelevant, but procedurally it may determine the order in which different answers to a query are found.
Because of this procedural meaning of logical formulas, logic can be used as a programming language. If we want to solve a problem in a particular domain, we write down the required knowledge and apply the inference rules built into the logic programming language. Declaratively, this knowledge specifies what the problem is, rather than how it should be solved.
The distinction between declarative and procedural aspects of problem solving is succinctly expressed by Kowalski’s equation
algorithm = logic + control
Here, logic refers to declarative knowledge, and control refers to procedural knowledge. The equation expresses that both components are needed to solve a problem algorithmically.
In a purely declarative programming language, the programmer would have no means to express procedural knowledge, because logically equivalent programs would behave identically. However, Prolog is not a purely declarative language, and therefore the procedural meaning of Prolog programs cannot be ignored. For instance, the order of the literals in the body of a clause usually influences the efficiency of the program to a large degree. Similarly, the order of clauses in a program often determines whether a program will give an answer at all.
Therefore, in this chapter we will take a closer look at Prolog’s inference engine and its built-in features (some of which are non-declarative). Also, we will discuss some common programming techniques.
Prolog’s proof procedure is based on resolution refutation in definite clause logic. Resolution refutation has been explained in the previous chapter. In order to turn it into an executable proof procedure, we have to specify how a literal to resolve upon is selected, and how the second input clause is found. Jointly, this is called a resolution strategy.
Consider the following program:
student_of(X,T):-follows(X,C),teaches(T,C). follows(paul,computer_science). follows(paul,expert_systems). follows(maria,ai_techniques). teaches(adrian,expert_systems). teaches(peter,ai_techniques). teaches(peter,computer_science).
The query
?-student_of(S,peter).
has two possible answers: { S
→ paul
} and { S
→ maria
}. In order to find these answers, we first resolve the query with the first clause, yielding
?-follows(S,C),teaches(peter,C).
Now we have to decide whether we will search for a clause which resolves on follows(S,C)
, or for a clause which resolves on teaches(peter,C)
. This decision is governed by a selection rule. Prolog’s selection rule is left to right, thus Prolog will search for a clause with a positive literal unifying with follows(S,C)
. There are three of these, so now we must decide which one to try first.
Prolog searches the clauses in the program top-down, so Prolog finds the answer { S
→ paul
} first. Note that the second choice leads to a dead end: the resolvent is
?-teaches(peter,expert_systems).
which doesn’t resolve with any clause in the program.
This process is called SLD-resolution: S for selection rule, L for linear resolution (which refers to the shape of the proof trees obtained), and D for definite clauses. Graphically, SLD-resolution can be depicted as in fig. 3.1.
Figure 3.1.
An SLD-tree.
This SLD-tree should not be confused with a proof tree: first, only the resolvents are shown (no input clauses or unifiers), and second, it contains every possible resolution step. Thus, every leaf of an SLD-tree which contains the empty clause □ corresponds to a refutation and hence to a proof tree; such a leaf is also called a success branch. An underlined leaf which does not contain □ represents a failure branch.
Exercise 3.1.
Draw the proof trees for the two success branches in fig. 3.1.
As remarked already, Prolog searches the clauses in the program top-down, which is the same as traversing the SLD-tree from left to right. This not only determines the order in which answers (i.e. success branches) are found: it also determines whether any answers are found at all, because an SLD-tree may contain infinite branches, if some predicates in the program are recursive.
brother_of(X,Y):-brother_of(Y,X). brother_of(paul,peter).
An SLD-tree for the query
?-brother_of(peter,B).
is depicted in fig. 3.2.
Figure 3.2.
An SLD-tree with infinite branches.
If we descend this tree taking the left branch at every node, we will never reach a leaf. On the other hand, if we take the right branch at every node, we almost immediately reach a success branch. Taking right branches instead of left branches in an SLD-tree corresponds to searching the clauses from bottom to top.
However, note that, even after this modification, the program still has some problems. For one thing, the query ?-brother_of(peter,B)
will be answered an infinite number of times, because there are infinitely many refutations of it. But, even worse, consider a query that does not have an answer, like
?-brother_of(peter,maria).
No matter the order in which the SLD-tree is descended, Prolog will never discover that the query has in fact no answer, simply because the SLD-tree is infinite. So, one should be careful with programs like the above, which define a predicate to be symmetric.
Another property of predicates which can cause similar problems is transitivity.Consider the following program:
brother_of(paul,peter). brother_of(peter,adrian). brother_of(X,Y):-brother_of(X,Z),brother_of(Z,Y).
The third clause ensures that ?-brother_of(paul,adrian)
is a logical consequence of the program.
?-brother_of(paul,B).
is depicted in fig. 3.3. Not only is this SLD-tree infinite, but the resolvents get longer and longer on deeper levels in the tree.
Figure 3.3.
An SLD-tree with infinite branches and expanding resolvents.
We have encountered two problems with SLD-resolution: (i) we might never reach a success branch in the SLD-tree, because we get ‘trapped’ into an infinite subtree, and (ii) any infinite SLD-tree causes the inference engine to loop if no (more) answers are to be found. The first problem means that Prolog is incomplete: some logical consequences of a program may never be found.
There exists a solution to this problem: if we descend the tree layer by layer rather than branch-by-branch, we will find any leaf before we descend to the next level. However, this also means that we must keep track of all the resolvents on a level, instead of just a single one. Therefore, this breadth-first search strategy needs much more memory than the depth-first strategy used by Prolog.
As for the second problem, we already saw that this is due to the semi-decidability of full clausal logic, which means that there is no general solution to it.
Exercise 3.2.
Draw the SLD-tree for the following program:
list([]). list([_H|T]):-list(T).
and the query
?-list(L).
As shown in the previous section, Prolog constantly searches the clauses in a program in order to reach a success branch in the SLD-tree for a query. If a failure branch is reached (i.e., a non-empty resolvent which cannot be reduced any further), Prolog has to ‘unchoose’ the last-chosen program clause, and try another one. This amounts to going up one level in the SLD-tree, and trying the next branch to the right. This process of reconsidering previous choices is called backtracking.
As an illustration, consider again the SLD-tree in fig. 3.1. The resolvent in the middle branch
:-teaches(peter,expert_systems)
cannot be reduced any further, and thus represents a failure branch. At that point, the stack contains (top-down) the previous resolvents
:-follows(S,C),teaches(peter,C)
?-student_of(S,peter)
The top one is popped from the stack; it has been most recently resolved with follows(paul,expert_systems)
, so we continue searching the program from that clause, finding follows(maria,ai_techniques)
as the next alternative.
A node in the SLD-tree which is not a leaf is called a choice point, because the subtree rooted at that node may contain several success branches, each of which may be reached by a different choice for a program clause to resolve with.
!
), because it cuts away (or prunes) part of the SLD-tree.
To illustrate the effect of cut, consider the following program.
parent(X,Y):-father(X,Y). parent(X,Y):-mother(X,Y). father(john,paul). mother(mary,paul).
The SLD-tree for the query
?-parent(john,C).
is given in fig. 3.4.
Figure 3.4.
SLD-tree for the query?-parent(john,C)
.
The answer given by Prolog is { C
→ paul
}. By asking whether there are any other answers, we force Prolog to backtrack to the most recent choice point for which there are any alternatives left, which is the root of the SLD-tree (i.e. the original query). Prolog tries the second clause for parent
, but discovers that this leads to a failure branch.
Of course, we know that this backtracking step did not make sense: if John is a father of anyone, he can’t be a mother. We can express this by adding a cut to the first parent
clause:
parent(X,Y):-father(X,Y),!. parent(X,Y):-mother(X,Y). father(john,paul). mother(mary,paul).
The cut says: once you’ve reached me, stick to all the variable substitutions you’ve found after you entered my clause. That is: don’t try to find any alternative solutions to the literals left of the cut, and also: don’t try any alternative clauses for the one in which the cut is found.
Given this modified program, the SLD-tree for the same query is shown in fig. 3.5.
Figure 3.5.
The effect of cut.
Since !
is true by definition, the resolvent :-!
reduces to the empty clause. The shaded part represents the part of the SLD-tree which is pruned as a result of the cut. That is: every alternative at choice points below and including ?-parent(john,C)
, which are on the stack when the cut is reached, are pruned.
A cut is harmless if it does not cut away subtrees containing success branches. If a cut prunes success branches, then some logical consequences of the program are not returned as answers, resulting in a procedural meaning different from the declarative meaning. Cuts of the first kind are called green cuts, while cuts of the second kind are called red cuts. A green cut merely stresses that the conjunction of literals to its left is deterministic: it does not give alternative solutions. In addition, it signifies that if those literals give a solution, the clauses below it will not result in any alternatives.
This seems to be true for the above program: John is the father of only one child, and no-one is both a father and a mother. However, note that we only analysed the situation with regard to a particular query. We can show that the cut is in fact red by asking the query
?-parent(P,paul).
The answer { P
→ mary
} is pruned by the cut (fig. 3.7). That is, the literal father(X,Y)
left to the cut is only deterministic if X
is instantiated
(is substituted by a non-variable value).
Figure 3.7.
A success branch is pruned.
Note that success branches are also pruned for the first query if John has several children:
parent(X,Y):-father(X,Y),!. parent(X,Y):-mother(X,Y). father(john,paul). father(john,peter). mother(mary,paul). mother(mary,peter).
The SLD-tree for the query
?-parent(john,C).
is given in fig. 3.8.
Figure 3.8.
Another success branch is pruned.
Indeed, the second answer { C
→ peter
} is pruned by the cut. This clearly shows that the effect of a cut is not only determined by the clause in which it occurs but also by other clauses. Therefore, the effect of a cut is often hard to understand.
Programs with cuts are not only difficult to understand; this last example also shows that their procedural interpretation (the set of answers they produce to a query) may be different from their declarative interpretation (the set of its logical consequences).
Exercise 3.3.
Draw the SLD-tree for the query
?-likes(A,B)
given the following program:
likes(peter,Y):-friendly(Y). likes(T,S):-student_of(S,T). student_of(maria,peter). student_of(paul,peter). friendly(maria).
Add a cut in order to prune away one of the answers { A
→ peter
, B
→ maria
}, and indicate the result in the SLD-tree. Can this be done without pruning away the third answer?
The following program computes the maximum of two integers:
max(M,N,M):- M >= N. max(M,N,N):- M =< N.
>=
and =<
are built-in predicates with meaning ‘greater than or equal’ and ‘less than or equal’, respectively
[8]
. Declaratively, the program captures the intended meaning, but procedurally there are two different ways to solve queries of the form ?-max(N,N,M)
.
max
predicate are equal. We could of course remove one of the equality symbols, but suppose that we use a cut instead:
max(M,N,M):- M >= N,!. max(_M,N,N).
With a red cut, this program can only be understood procedurally. The question is: does the procedural meaning correspond to the intended meaning?
?-max(5,3,3).
succeeds: the cut is never reached, because the literal in the query does not unify with the head of the first clause. The second program is in fact a very bad program: the declarative and procedural meanings differ, and neither of them captures the intended meaning.
Exercise 3.4.
Show that this cut is red, by drawing an SLD-tree in which a success branch is pruned.
The procedural meaning of the program would be correct if its use is restricted to queries with uninstantiated third argument. It illustrates a very common use of cut: to ensure that the bodies of the clauses are mutually exclusive. In general, if we have a program of the form
p:-q,!,r.
p:-s.
its meaning is something like
p:-q,r.
p:-not_q,s.
How should not_q
be defined, in order to make the second program work? If q
succeeds, not_q
should fail. This is expressed by the following clause:
not_q:-q,fail
where fail
is a built-in predicate, which is always false. If q
fails, not_q
should succeed. This can be realised by the program
not_q:-q,!,fail.
not_q.
The cut in the first clause is needed to prevent backtracking to the second clause when q
succeeds.
This approach is not very practical, because it only works for a single proposition symbol, without variables. We would like to treat the literal to be negated as a parameter, as in
not(Goal):- /* execute Goal, */ !,fail.
not(Goal).
To execute a goal passed on as a variable we can use the built-in predicate call/1
, which takes a goal as argument and succeeds if and only if execution of that goal succeeds:
not(Goal):- call(Goal) !,fail.
not(Goal).
not
and call
are called meta-predicates, that take formulas from the same logical language in which they are written as arguments. As we will see in later chapters, meta-predicates play an important role in this book.
We illustrate the operation of not
by means of the following propositional program:
p:-q,r.
p:-not(q),s.
s.
and the query ?-p
. The SLD-tree is shown in fig. 3.9. The first clause for p
leads to a failure branch, because q
cannot be proved. The second clause for p
is tried, and not(q)
is evaluated by trying to prove q
. Again, this fails, which means that the second clause for not
is tried, which succeeds. Thus, not(q)
is proved by failing to prove q
! Therefore, this kind of negation is called negation as failure.
Figure 3.9.
SLD-tree with not
.
Fig. 3.9 shows, that Prolog tries to prove q
twice. Consequently, the program with not
is slightly less efficient than the version with cut:
p:-q,!,r.
p:-s.
s.
which leads to the SLD-tree shown in fig. 3.10. Here, q
is tried only once. However, in general we prefer the use of not
, because it leads to programs of which the declarative meaning corresponds more closely to the procedural meaning.
Figure 3.10.
Equivalent SLD-tree with cut.
In the following program, :-not(q)
fails because :-q
succeeds:
p:-not(q),r.
p:-q.
q.
r.
The SLD-tree for the query ?-p
is shown in fig. 3.11. Since q
succeeds, fail
ensures that not(q)
fails. The cut is needed to ensure that everything following the not
is pruned, even if it contains a success branch.
Figure 3.11.
:-not(q)
fails because :-q
succeeds.
The implementation of not
illustrated above can lead to problems if variables are involved. Take a look at the following program:
bachelor(X):-not(married(X)),man(X). man(fred). man(peter). married(fred).
Exercise 3.5.
Draw the SLD-trees for the queries ?-bachelor(fred)
and ?-bachelor(peter)
.
Consider the query
?-bachelor(X).
for which the SLD-tree is depicted in fig. 3.12.
Figure 3.12.
There are no bachelors?!
According to negation as failure, Prolog tries to prove not(married(X))
by trying married(X)
first. Since this succeeds for X
= fred
, the cut is reached and the success branch to the right (representing the correct answer { X
→ peter
}) is pruned. Thus, :‑not(married(X))
fails because :‑married(X)
succeeds for one value of X
. That is, not(married(X))
is interpreted as ‘it is false that somebody is married’, or equivalently, ‘nobody is married’.
bachelor(X):-not(married(X)),man(X)
is interpreted as ‘ X
is a bachelor if nobody is married and X
is a man’, which is of course not as intended.
Thus, if G
is instantiated to a goal containing variables at the time not(G)
is called, the result may be not in accordance with negation as failure. It is the programmer’s responsibility to avoid this. A simple remedy that will often work is to ensure the grounding of G
by literals preceding not(G)
in the body of the clause, i.e.
bachelor(X):-man(X),not(married(X)). man(fred). man(peter). married(fred).
Exercise 3.6.
Show that the modified program produces the right answer, by drawing the SLD-tree for the query ?-bachelor(X)
.
Let’s summarise the points made about negation in Prolog. It is often used to ensure that only one of several possible clauses is applicable. The same effect can be achieved by means of cut, but in general we prefer the use of not
, although it is somewhat less efficient
[10]
. not
is supplied by Prolog as a meta-predicate (i.e. a predicate which takes formulas from the same logical language in which it is written as arguments). It is only a partially correct implementation of negation as failure, since it does not operate correctly when its argument is a goal containing variables.
Consider the following propositional program:
p:-q,r,s,!,t.
p:-q,r,u.
q.
r.
u.
Exercise 3.7.
Show that the query ?-p
succeeds, but that q
and r
are tried twice.
This inefficiency can be avoided by putting s,!
at the beginning of the body of the first clause. However, in full clausal logic the goals preceding s
might supply necessary variable bindings, which requires them to be called first. A possible solution would be the introduction of an extra proposition symbol:
p:-q,r,if_s_then_t_else_u.
if_s_then_t_else_u:-s,!,t.
if_s_then_t_else_u:-u.
Exercise 3.8.
Show that q
and r
are now tried only once.
Just as we did with not
, we can rewrite this new proposition symbol to a generally applicable meta-predicate:
if_then_else(S,T,U):-S,!,T.
if_then_else(S,T,U):-U.
Note that we can nest applications of if_then_else
, for instance
if_then_else_else(P,Q,R,S,T):-
if_then_else(P,Q,if_then_else(R,S,T)).
Unfolding the definition of if_then_else
yields
if_then_else_else(P,Q,R,S,T):-P,!,Q.
if_then_else_else(P,Q,R,S,T):-R,!,S.
if_then_else_else(P,Q,R,S,T):-T.
which clearly shows the meaning of the predicate: ‘if P then Q else if R then S else T ’.
P->Q;R
for if-then-else; the nested variant then becomes P->Q;(R->S;T)
. The parentheses are not strictly necessary, but in general the outermost if-then-else literal should be enclosed in parentheses.
diagnosis(Patient,Condition):-
temperature(Patient,T),
( T=<37 -> blood_pressure(Patient,Condition)
; T>37,T<38 -> Condition=ok
; otherwise -> diagnose_fever(Patient,Condition)
).
otherwise
is always assigned the truthvalue true, so the last rule applies if all the others fail.
not
and if-then-else show that many uses of cut can be replaced by higher-level constructs, which are easier to understand. However, this is not true for every use of cut. For instance, consider the following program:
play(Board,Player):-
lost(Board,Player).
play(Board,Player):-
find_move(Board,Player,Move),
make_move(Board,Move,NewBoard),
next_player(Player,Next),
play(NewBoard,Next).
This program plays a game by recursively looking for best moves. Suppose one game has been finished; that is, the query ?-play(Start,First)
(with appropriate instantiations of the variables) has succeeded. As usual, we can ask Prolog whether there are any alternative solutions. Prolog will start backtracking, looking for alternatives for the most recent move, then for the move before that one, and so on. That is, Prolog has maintained all previous board situations, and every move made can be undone.
Note that this only works if the recursive call is the last call in the body. In general, it is advisable to write your recursive predicates like play
above: the non-recursive clause before the recursive one, and the recursive call at the end of the body. A recursive predicate written this way is said to be tail recursive.
In Logic Programming, recursion is the only looping control structure. Consequently, recursive datatypes such as lists can be expressed very naturally. Natural numbers also have a recursive nature: ‘0 is a natural number, and if X is a natural number, then the successor of X is also a natural number’. In Prolog, this is expressed as
nat(0). nat(s(X)):-nat(X).
Addition of natural numbers is defined in terms of successors:
add(0,X,X). add(s(X),Y,s(Z)):-add(X,Y,Z).
The following query asks for the sum of two and three:
?-add(s(s(0)),s(s(s(0))),Z).
Z = s(s(s(s(s(0)))))
We can also find an X such that the sum of X and Y is Z (i.e., subtract Y from Z):
?-add(X,s(s(s(0))),s(s(s(s(s(0)))))).
X = s(s(0))
We can even find all X and Y which add up to a given sum. Thus, this program is fully declarative. Similarly, multiplication is repeated addition:
mul(0,_X,0). mul(s(X),Y,Z):-mul(X,Y,Z1),add(Y,Z1,Z).
There are two problems with this approach to representing and manipulating natural numbers. First, naming natural numbers by means of the constant symbol 0
and the functor s
is very clumsy, especially for large numbers. Of course, it would be possible to write a translator from decimal notation to successor notation, and back. However, the second problem is more fundamental: multiplication as repeated addition is extremely inefficient compared to the algorithm for multiplicating numbers in decimal notation. Therefore, Prolog has built-in arithmetic facilities, which we will discuss now.
Consider the arithmetic expression 5+7-3
. Prolog will view this expression as the term +(5,-(7,3))
, with the functors +
and -
written as infix operators. We want to evaluate this expression, i.e. we want a single numerical value which represents somehow the same number as the expression. A program for doing this would look something like
is(V,E1+E2):-
is(V1,E1),is(V2,E2),
fast_add(V1,V2,V).
is(V,E1-E2):-
is(V1,E1,),is(V2,E2),
fast_sub(V1,V2,V).
is(E,E):-
number(E).
Here, fast_add
and fast_sub
represent the fast, built-in procedures for addition and subtraction, which are not directly available to the user. These procedures are not reversible: its first two arguments must be instantiated. Therefore, the predicate is
will include a test for groundness of its second argument (the arithmetic expression), and will quit with an error-message if this test fails.
The is
predicate is a built-in feature of Prolog, and is declared as an infix operator. Its behaviour is illustrated by the following queries:
?-X is 5+7-3
X = 9
?-9 is 5+7-3
Yes
?-9 is X+7-3
Error in arithmetic expression
?-X is 5*3+7/2
X = 18.5
% Try these queries here.
The last example shows, that arithmetic expressions obey the usual precedence rules (which can be overruled using parentheses). Also, note that the is
predicate can handle real numbers.
Prolog also provides a built-in predicate =
, but this predicate behaves quite differently from is
, since it performs unification rather than arithmetic evaluation (see also section 2.3). The following queries illustrate the operation of =
:
?-X = 5+7-3
X = 5+7-3
?-9 = 5+7-3
No
?-9 = X+7-3
No
?-X = Y+7-3
X = _947+7-3
Y = _947
?-X = f(X)
X = f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(
f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(
f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(
Error: term being written is too deep
% Try these queries here. Is the answer to the last query really as described here?
The first query just unifies X
with the term 5+7-3
(i.e. +(5,-(7,3))
), which of course succeeds. In the second and third query, we try to unify a constant with a complex term, which fails. The fourth query succeeds, leaving Y
unbound (_947
is an internal variable name, generated by Prolog).
The fifth query illustrates that Prolog indeed omits the occur check (section 2.3) in unification: the query should have failed, but instead it succeeds, resulting in the circular binding { X
→ f(X)
}. The problem only becomes apparent when Prolog tries to write the resulting term, which is infinite. Just to stress that Prolog quite happily constructs circular bindings, take a look at the following strange program:
strange:-X=f(X).
The query ?-strange
succeeds, and since there is no answer substitution, it is not apparent that there is a circular binding involved.
Exercise 3.9.
Write a predicate zero(A,B,C,X)
which, given the coefficients a, b and c, calculates both values of x for which ax2+bx+c=0.
Finally, we mention that Prolog provides a number of other useful arithmetic predicates, including the inequality tests <
and >
, and their reflexive counterparts =<
and >=
. For these tests, both arguments should be instantiated to numbers.
The condition that the righthand-side of is
should not contain variables sometimes determines the ordering of literals in the body of the clause. For instance, in the program below, which computes the length of a list, the is
literal should be placed after the recursive length
call, which instantiates M
. This means that the resolvent first collects as many is
literals as there are elements in the list, before doing the actual calculation. Each of these literals contains some ‘local’ variables that require some space in memory. The total memory requirements are thus proportional to the depth of the recursion.
naive_length([],0). naive_length([_H|T],N):-naive_length(T,M),N is M+1.
Exercise 3.10.
Draw the proof tree for the query ?-naive_length([a,b,c],N)
.
Programs with tail recursion need less memory because they do all the work on one recursive level before proceeding to the next. There is a common trick to transform even the length
predicate above into a tail recursive program, using an auxiliary argument called an accumulator.
length_acc(L,N):-length_acc(L,0,N). length_acc([],N,N). length_acc([_H|T],N0,N):-N1 is N0+1,length_acc(T,N1,N).
length_acc(L,N0,N)
is true if N
is the number of elements in L
plus N0
. Initialising N0
to 0
results in N
returning the length of L
. Note that the actual counting is done by the second argument: only when the list is empty is the third argument unified with the second argument. The main point is that, since the accumulator is given an initial value of 0
, it is always instantiated, such that the is
literal can be placed before the recursive call.
Exercise 3.11.
Draw the proof tree for the query ?-length_acc([a,b,c],N)
.
Accumulators can be used in very many programs. Suppose we want to reverse the order of elements in a list. We could do this by recursively reversing the tail of the list, and putting the head at the end of the result:
naive_reverse([],[]). naive_reverse([H|T],R):-naive_reverse(T,R1),append(R1,[H],R). append([],Y,Y). append([H|T],Y,[H|Z]):-append(T,Y,Z).
This predicate is called ‘naive’ because a lot of unnecessary work is done by the append
calls in the recursive clause.
Exercise 3.12.
Draw the proof tree for the query ?-naive_reverse([a,b,c],R)
.
By using an accumulator, we can get rid of the append
predicate, as follows:
reverse(X,Y):- reverse(X,[],Y). reverse([],Y,Y). reverse([H|T],Y0,Y):- reverse(T,[H|Y0],Y).
reverse(X,Y0,Y)
is true if Y
consists of the reversal of X
followed by Y0
. Initialising Y0
to []
results in Y
returning the reversal of X
.
The use of an accumulator in this more efficient program for reversing a list is closely related to another programming trick for increasing the efficiency of list handling. The idea is not to represent a list by a single term, but instead by a pair of terms L1-L2
, such that the list actually represented is the difference between L1
and L2
. The term L1-L2
is appropriately called a difference list; L1
is called the plus list, and L2
is called the minus list.
[a,b,c,d]-[d]
represents the simple list [a,b,c]
, as does the difference list [a,b,c,1234,5678]-[1234,5678]
, and even the difference list [a,b,c|X]-X
. The last difference list can be seen as summarising every possible difference list representing the same simple list, by introducing a variable for the part which is not contained in the simple list.
As was remarked above, reverse(X,Y0,Y)
is true if Y
consists of the reversal of X
followed by Y0
. Another way to say the same thing is that the reversal of X
is the difference between Y
and Y0
. That is, the reversal of X
is represented by the difference list Y-Y0
! We can make this explicit by a small syntactic change to reverse
, resulting in the following program:
reverse_dl(X,Y):- reverse(X,Y-[]). reverse([],Y-Y). reverse([H|T],Y-Y0):- reverse(T,Y-[H|Y0]).
For instance, the third clause in this program says: if the reversal of T
is represented by the difference list Y-[H|Y0]
, then adding H
to the head of T
is the same as removing H
from the minus list in the difference list.
If the minus list is a variable, it can be used as a pointer to the end of the represented list. It is this property which makes difference lists so useful. For instance, if we unify [a,b,c|X]-X
with Y-[d,e]
, we get Y=[a,b,c,d,e]
— we have managed to append two lists together in a single unification step!
[a,b,c|XMinus]-XMinus
and
[d,e|YMinus]-YMinus
we must unify XMinus
with [d,e|YMinus]
(the plus list of the second difference list), such that the first difference list becomes
[a,b,c,d,e|YMinus]-[d,e|YMinus]
Combining the plus list of this difference list with YMinus
, we get exactly what we want.
In general, given two difference lists XPlus-XMinus
and YPlus-YMinus
, we unify XMinus
with YPlus
, and the result is given by XPlus-YMinus
(fig. 3.13):
append_dl(XPlus-XMinus,YPlus-YMinus,XPlus-YMinus):-
XMinus=YPlus.
or even shorter
append_dl(XPlus-YPlus,YPlus-YMinus,XPlus-YMinus).
Figure 3.13.
Appending two difference lists: the ‘length’ of XMinus
is adjusted by unification with YPlus
, the result is given by XPlus-YMinus
.
Appending a simple list to another simple list of n elements requires n resolution steps; appending two difference lists requires no resolution at all, just one unification. Using difference lists is almost always a good idea if you have to do a lot of list processing.
Exercise 3.13.
In the naive_reverse
predicate, represent the reversed list by a difference list, use append_dl
instead of append
, and show that this results in the predicate reverse_dl
by unfolding the definition of append_dl
.
Suppose we need a program to determine, given two lists of persons of equal length, whether a person in the first list is the parent of the corresponding person in the second list. The following program will do the job:
parents([],[]).
parents([P|Ps],[C|Cs]):-
parent(P,C),
parents(Ps,Cs).
We can generalise this program by including the relation which must hold between corresponding elements of the two lists as a parameter:
rel(R,[],[]).
rel(R,[X|Xs],[Y|Ys]):-
R(X,Y),
rel(R,Xs,Ys).
A term like R(X,Y)
is allowed at the position of an atom in the body of a clause, as long as it is correctly instantiated at the time it is called.
Some Prolog interpreters don’t allow this, in which case you must explicitly construct the literal by means of the built-in predicate ‘ =..
’ (sometimes called univ). It is a fully declarative predicate, which can both be used to construct a term from a list of arguments preceded by a functor, or to decompose a term into its constituents:
?-Term =.. [parent,X,peter]
Term = parent(X,peter)
?-parent(maria,Y) =.. List
List = [parent,maria,Y]
‘ =..
’ is declared as an infix operator in Prolog.
Exercise 3.14.
Rewrite the program for rel
, using =..
The predicate rel
is called a second-order predicate, because it takes a (first-order) predicate as an argument
[11]
. We can now define the parents
predicate as
parents(Ps,Cs):-rel(parent,Ps,Cs).
Suppose now you have the following facts in your program, and you want to collect all the children of a particular parent in a list:
parent(john,peter). parent(john,paul). parent(john,mary). parent(mick,davy). parent(mick,dee). parent(mick,dozy).
Of course, it is easy to generate all the children upon backtracking; the problem is to collect them in a global list. To this end, Prolog provides the second-order predicates findall
, bagof
, and setof
. For instance, we could use the following program and query:
children(Parent,Children):- findall(C,parent(Parent,C),Children).
?-children(john,Children).
Children = [peter,paul,mary]
In general, the query
?-findall(X,Goal,ListofX)
generates all the possible solutions of the query ?‑Goal
, recording the substitutions for X
for each of these solutions in the list ListofX
(Goal
must be instantiated to a term representing a Prolog goal).
The bagof
predicate acts similarly. However, its behaviour is different when the goal contains free variables. Consider the query
?-bagof(C,parent(P,C),L)
in which the variable P
is unbound. This query has two possible interpretations: ‘find a parent and a list of his children’, and ‘find the list of children that have a parent ’.
P
and a list of P
’s children, which means that there are two solutions:
?-bagof(C,parent(P,C),L).
C = _951
P = john
L = [peter,paul,mary];
C = _951
P = mick
L = [davy,dee,dozy]
In the second case, the goal to prove is ‘there exists a P
such that parent(P,C)
is true’, which means that the variable P
is existentially quantified. This is signalled by prefixing the goal with P^
:
?-bagof(C,P^parent(P,C),L).
C = _957
P = _958
L = [peter,paul,mary,davy,dee,dozy]
The query
?-findall(C,parent(P,C),L).
(without existential quantification) can only generate this second solution.
Finally, Prolog provides the predicate setof
, which acts just like bagof
, except that the resulting list is sorted and does not contain duplicates. Thus, setof
is slightly less efficient than bagof
, and the latter is preferred in cases where the list of solutions is known not to contain duplicates.
Exercise 3.15.
Write a program which sorts and removes duplicates from a list, using setof
.
Prolog represents a clause Head:-Body
in the same way as a term :-(Head,Body)
. Thus, it is easy to write programs that manipulate clauses. In the first case, ‘ :-
’ is treated as a predicate, and in the second case it is treated as a functor. The combination of these two interpretations occurs frequently in Prolog programs, and can be applied to any predicate p
.
p
as a predicate occurs on the object-level, and the interpretation as a functor occurs on the meta-level. (Note that the difference between meta-predicates and higher-order predicates is that meta-predicates take object-level clauses as arguments, while the latter take lower-order predicates as arguments.)
For instance, suppose we have the following biological knowledge, expressed as propositional if-then rules:
% if A and B then C means if(then(and(A,B),C)) :-op(900,fx,if). :-op(800,xfx,then). :-op(700,yfx,and). % object-level rules if has_feathers and lays_eggs then is_bird. if has_gills and lays_eggs then is_fish. if tweety then has_feathers. if tweety then lays_eggs.
Suppose we want to prove that Tweety is a bird. That is, we want to show that the rule
if tweety then is_bird
follows logically from the given rules.
% meta-program derive(if Assumptions then Goal):- if Body then Goal, derive(if Assumptions then Body). derive(if Assumptions then Goal1 and Goal2):- derive(if Assumptions then Goal1), derive(if Assumptions then Goal2). derive(if Assumptions then Goal):- assumed(Goal,Assumptions). assumed(A,A). assumed(A,A and _As). assumed(A,_B and As):- assumed(A,As).
The three clauses for the derive
predicate represent the three possible cases:
As explained above, if
is a predicate on the object-level, and a functor on the meta-level.
Exercise 3.16.
Draw the SLD-tree for the query
?-derive(if tweety then is_bird).
Since propositional definite clauses are similar to the above if-then rules, one could view this program as a propositional Prolog simulator. In fact, it is possible to push the resemblance closer, by adopting the Prolog-representation of clauses at the object-level.
:-
’ is not directly available as an object-level predicate. Instead, Prolog provides the built-in predicate clause
: a query ?‑clause(H,B)
succeeds if H:-B
unifies with a clause in the internal Prolog database (if H
unifies with a fact, B
is unified with true
).
if Assumptions then Goal
; instead, the Assumptions
are added to the object-level program, from which a proof of Goal
is attempted.
Following these observations, the predicate derive
is changed as follows:
prove(Goal):-
clause(Goal,Body),
prove(Body).
prove((Goal1,Goal2)):-
prove(Goal1),
prove(Goal2).
prove(true).
This program nicely reflects the process of constructing a resolution proof:
Some Prolog interpreters have problems if clause
is called with the first argument instantiated to true
or a conjunction, because true
and ‘ ,
’ (comma) are built-in predicates. To avoid these problems, we should add the conditions not A=true
and not A=(X,Y)
to the first clause. A less declarative solution is to reorder the clauses and use cuts:
prove(true):-!.
prove((A,B)):-!,
prove(A),
prove(B).
prove(A):-
/* not A=true, not A=(X,Y) */
clause(A,B),
prove(B).
A meta-program interpreting programs in the same language in which it is written is called a meta-interpreter. In order to ‘lift’ this propositional meta-interpreter to clauses containing variables, it is necessary to incorporate unification into the third clause.
unify
and apply
, such that unify(T1,T2,MGU,T)
is true if T
is the result of unifying T1
and T2
with most general unifier MGU
, and apply(T,Sub,TS)
is true if TS
is the term obtained from T
by applying substitution Sub
. The meta-interpreter would then look like this:
prove_var(true):-!.
prove_var((A,B)):-!,
prove(A),
prove(B).
prove_var(A):-
clause(Head,Body),
unify(A,Head,MGU,Result),
apply(Body,MGU,NewBody),
prove_var(NewBody).
Prolog’s own unification predicate =
does not return the most general unifier explicitly, but rather unifies the two original terms implicitly. Therefore, if we want to use the built-in unification algorithm in our meta-interpreter, we do not need the apply
predicate, and we can write the third clause as
prove_var(A):-
clause(Head,Body),
A=Head,
prove_var(Body)
If we now change the explicit unification in the body of this clause to an implicit unification in the head, we actually obtain the propositional meta-interpreter again! That is, while this program is read declaratively as a meta-interpreter for propositional programs, it nevertheless operates procedurally as an interpreter of first-order clauses (fig. 3.14).
Note that this meta-interpreter is able to handle only ‘pure’ Prolog programs, without system predicates like cut or is
, since there are no explicit clauses for such predicates.
Figure 3.14.
The prove
meta-interpreter embodies a declarative implementation of the resolution proof procedure, making use of built-in unification.
Exercise 3.17.
Draw the SLD-tree for the query ?-prove(is_bird(X))
, given the following clauses:
is_bird(X):-has_feathers(X),lays_eggs(X).
is_fish(X):-has_gills(X),lays_eggs(X).
has_feathers(tweety).
lays_eggs(tweety).
A variety of meta-interpreters will be encountered in this book. Each of them is a variation of the above ‘canonical’ meta-interpreter in one of the following senses:
Here, we will give two example variations. In the first example, we change the meta-interpreter in order to handle general clauses by means of negation as failure (case 3). All we have to do is to add the following clause:
prove(not A):-
not prove(A)
This clause gives a declarative description of negation as failure.
The second variation extracts additional information from the SLD proof procedure by means of a proof tree (case 4). To this end, we need to make a slight change to the meta-interpreter given above. The reason for this is that the second clause of the original meta-interpreter breaks up the current resolvent if it is a conjunction, whereas in a proof tree we want the complete resolvent to appear.
% meta-interpreter with complete resolvent
prove_r(true):-!.
prove_r((A,B)):-!,
clause(A,C),
conj_append(C,B,D),
prove_r(D).
prove_r(A):-
clause(A,B),
prove_r(B).
%%% conj_append/3: see Appendix A.2
We now extend prove_r/1
with a second argument, which returns the proof tree as a list of pairs p(Resolvent,Clause)
:
% display a proof tree prove_p(A):-prove_p(A,P),write_proof(P). % prove_p(A,P) <- P is proof tree of A prove_p(true,[]):-!. prove_p((A,B),[p((A,B),(A:-C))|Proof]):-!, clause(A,C), conj_append(C,B,D), prove_p(D,Proof). prove_p(A,[p(A,(A:-B))|Proof]):- clause(A,B), prove_p(B,Proof). write_proof([]):- write('...............[]'),nl. write_proof([p(A,B)|Proof]):- write((:-A)),nl, write('.....|'),write('..........'),write(B),nl, write('.....|'),write('..................../'),nl, write_proof(Proof).
For instance, given the following clauses:
student_of(S,T):-teaches(T,C),follows(S,C).
teaches(peter,cs).
teaches(peter,ai).
follows(maria,cs).
follows(paul,ai).
and the query ?-prove_p(student_of(S,T))
, the program writes the following proof trees:
:-student_of(peter,maria)
| student_of(peter,maria):-teaches(peter,cs),follows(maria,cs)
| /
:-(teaches(peter,cs),follows(maria,cs))
| teaches(peter,cs):-true
| /
:-follows(maria,cs)
| follows(maria,cs):-true
| /
[]
:-student_of(peter,paul)
| student_of(peter,paul):-teaches(peter,ai),follows(paul,ai)
| /
:-(teaches(peter,ai),follows(paul,ai))
| teaches(peter,ai):-true
| /
:-follows(paul,ai)
| follows(paul,ai):-true
| /
[]
Note that these are propositional proof trees, in the sense that all substitutions needed for the proof have already been applied. If we want to collect the uninstantiated program clauses in the proof tree then we should make a copy of each clause, before it is used in the proof:
prove_p((A,B),[p((A,B),Clause)|Proof]):-!,
clause(A,C),
copy_term((A:-C),Clause), % make copy of the clause
conj_append(C,B,D),
prove_p(D,Proof)
The predicate copy_term/2
makes a copy of a term, with all variables replaced by new ones. It is a built-in predicate in many Prolog interpreters, but could be defined by means of assert/2
and retract/2
(see Appendix A.2 for details).
At the end of this chapter, we spend a few words on the methodology of writing Prolog programs. Given a problem to solve, how do I obtain the program solving the problem? This is the fundamental problem of software engineering. Here, we can only scratch the surface of this question: we will concentrate on the subtask of writing relatively simple predicates which use no more than two other predicates.
Consider the following problem: define a predicate which, given a number n, partitions a list of numbers into two lists: one containing numbers smaller than n, and the other containing the rest. So we need a predicate partition/4
:
% partition(L,N,Littles,Bigs) <- Littles contains numbers
% in L smaller than N,
% Bigs contains the rest
Since the only looping structure of Prolog is recursion, simple predicates like this will typically be recursive. This means that
For list predicates, the recursion argument is typically a list, and the distinction is typically between empty and non-empty lists. For the partition/4
predicate, the recursion argument is the first list. The base case is easily identified: the empty list is partitioned in two empty lists, no matter the value of N
. This gives us the following skeleton:
partition([],N,[],[]).
partition([Head|Tail],N,?Littles,?Bigs):-
/* do something with Head */
partition(Tail,N,Littles,Bigs).
The question marks denote output arguments, whose relation to the variables in the recursive call still has to be decided. It should be noted that not all predicates are tail recursive, so it is not yet known whether the recursive call will be last indeed. Notice also that the output arguments in the recursive call have been given meaningful names, which is, in general, a good idea.
Once we have ‘chopped off’ the first number in the list, we have to do something with it. Depending on whether it is smaller than N
or not, it has to be added to the Littles
or the Bigs
. Suppose Head
is smaller than N
:
partition([Head|Tail],N,?Littles,?Bigs):-
Head < N,
partition(Tail,N,Littles,Bigs)
Thus, Head
must be added to Littles
. In this case, it does not matter in which position it is added: obviously, the most simple way is to add it to the head of the list:
?Littles = [Head|Littles]
In such cases, where output arguments are simply constructed by unification, the unification is performed implicitly in the head of the clause (the fourth argument remains unchanged):
partition([Head|Tail],N,[Head|Littles],Bigs):-
Head < N,
partition(Tail,N,Littles,Bigs)
A second recursive clause is needed to cover the case that Head
is larger than or equal to N
, in which case it must be added to Bigs
. The final program looks as follows:
% partition(L,N,Littles,Bigs) <- Littles contains numbers % in L smaller than N, % Bigs contains the rest partition([],_N,[],[]). partition([Head|Tail],N,[Head|Littles],Bigs):- Head < N, partition(Tail,N,Littles,Bigs). partition([Head|Tail],N,Littles,[Head|Bigs]):- Head >= N, partition(Tail,N,Littles,Bigs).
The approach taken here can be formulated as a general strategy for writing Prolog predicates. The steps to be performed according to this strategy are summarised below:
Notice that step (4) comprises most of the work, while the other steps are meant to make this work as easy as possible.
Exercise 3.18.
Implement a predicate permutation/2
, such that permutation(L,P)
is true if P
contains the same elements as the list L
but (possibly) in a different order, following these steps. (One auxiliary predicate is needed.)
As a second example, consider the problem of sorting a list of numbers. The declarative specification is as follows:
%mySort(L,S) <- S is a sorted permutation of list L
Note that this specification can immediately be translated to Prolog:
mySort(L,S):-
permutation(L,S),
sorted(S).
This program first guesses a permutation of L
, and then checks if the permutation happens to be sorted. Declaratively, this program is correct; procedurally, it is extremely inefficent since there are n! different permutations of a list of length n. Thus, we have to think of a more efficient algorithm.
The recursion and output arguments are easily identified as the first and second argument, respectively. The base case states that the empty list is already sorted, while the recursive clause states that a non-empty list is sorted by sorting its tail separately:
mySort([],[]).
mySort([Head|Tail],?Sorted):-
/* do something with Head */
mySort(Tail,Sorted).
It remains to decide what the relation is between ?Sorted
, Head
and Sorted
. Obviously, Head
cannot be simply added to the front of Sorted
, but has to be inserted in the proper place. We thus need an auxiliary predicate insert/3
, to add Head
at the proper position in Sorted
. Note that tail recursion is not applicable in this case, since we have to insert Head
in an already sorted list. We thus arrive at the following definition:
mySort([],[]). mySort([Head|Tail],WholeSorted):- mySort(Tail,Sorted), insert(Head,Sorted,WholeSorted).
In order to implement insert/3
, we follow the same steps. The second argument is the recursion argument, and the third is the output argument. This gives the following skeleton:
insert(X,[],?Inserted).
insert(X,[Head|Tail],?Inserted):-
/* do something with Head */
insert(X,Tail,Inserted).
The base case is simple: ?Inserted = [X]
. In the recursive clause, we have to compare X
and Head
. Suppose X
is greater than Head
:
insert(X,[Head|Tail],?Inserted):-
X > Head,
insert(X,Tail,Inserted)
We have to construct the output argument ?Inserted
. Since X
has already been properly inserted to Tail
, it remains to add Head
to the front of Inserted
:
?Inserted = [Head|Inserted]
A third clause is needed if X
is not greater than Head
(note that this clause, being non-recursive, is a second base case):
insert(X,[Head|Tail],?Inserted):-
X =< Head
In this case, X
should be added before Head
:
?Inserted = [X,Head|Tail]
The complete program is given below:
insert(X,[],[X]). insert(X,[Head|Tail],[Head|Inserted]):- X > Head, insert(X,Tail,Inserted). insert(X,[Head|Tail],[X,Head|Tail]):- X =< Head.
Exercise 3.19.
Implement an alternative to this sorting method by using the partition/4
predicate.