91.304 Foundations of (Theoretical) Computer Science : 1 91.304 Foundations of (Theoretical) Computer Science Chapter 3 Lecture Notes
David Martin
dm@cs.uml.edu This work is licensed under the Creative Commons Attribution-ShareAlike License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/2.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.
Turing machine syntax : 2 Turing machine syntax Definition A Turing Machine is an automaton M=(Q,,,,q0,qacc,qrej) where
Q is a finite set of states
is an input alphabet that does not include " t ", the special blank character
is a tape alphabet satisfying
t 2
µ
:Q£! Q££{L,R} is the transition function
q0 is the initial state
qacc is the single accepting state
qrej is the single rejecting state
Differences in input mechanism : 3 Differences in input mechanism A TM has a "tape head" that points to exactly one cell on its tape, which extends infinitely to the right
At each transition, the TM looks at the current state and the current cell, and decides what new state to move to, what to write on the current cell, and whether to move one cell to the left or one cell to the right
Hence the transition function :Q£! Q££{L,R}
Each tape cell initially contains the blank character t
Our previous automata (DFAs, NFAs, PDAs) all had a separate read-only input stream
But in a TM, the input is given all at once and just written onto the left end of the tape — overwriting the blanks there a | b | a | b | a | b | t | t | t | t | t | t | t | in state q7
Turing machine computation : 4 Turing machine computation As with PDAs, we define a set of instantaneous descriptions (IDs) and then show what memory-state snapshots may follow each other, according to the program M.
First, the snapshots: in a TM, ID(M) = * Q *
Each element of this set represents the entire tape contents, the current state, and the location of the tape head
In example below, the ID is ab q7a babtt
So the character to the right of the state name is the "current" character
The tape always has infinitely many blanks on the right; we can write them or omit them as we please a | b | a | b | a | b | t | t | t | t | t | t | t | in state q7
Turing machine computation : 5 Turing machine computation Two IDs are related to each other (by `) if one can lead to the other according to the function
So we look at all of the things that can say, starting with right moves:
Suppose (q,b) = (t,c,R) where
R means "right move"
q 2 Q - {qacc, qrej} and b 2 (states in green)
t 2 Q and c 2
Then u qb v ` u ct v
where u,v2* are undisturbed, the state has changed from q to t, the tape cell has changed from b to c, and the head has moved one character to the right (over the now-changed character)
Turing machine computation : 6 Turing machine computation Left moves
Suppose (q,b) = (t,c,L) where
q 2 Q - {qacc, qrej} and b 2 (states in green)
t 2 Q and c 2
Then ua qb v ` u tac v
where u,v2* and a2 are undisturbed, the state has changed from q to t, the tape cell has changed from b to c, and the head has moved one character to the left
This says that one ID can lead to another ID when says to move left and there is a character a2 to the left. What if there is no such character?
Turing machine computation : 7 Turing machine computation Left moves at left edge of tape
Suppose (q,b) = (t,c,L) where
q 2 Q - {qacc, qrej} and b 2 (states in green)
t 2 Q and c 2
Then qb v ` tc v
where v2* is undisturbed, the state has changed from q to t, the tape cell has changed from b to c
Where does this put the tape head in this case?
Note we have not explicitly covered the case where (q,b) = (t,c,L) and q2{qacc,qrej}
Or when we move R instead of L
Conclusion: well, if the current ID is u qb vand q2{qacc,qrej}, then no "next ID" is possible. We say that the TM halts
Language recognized by TM : 8 Language recognized by TM Finally, we let `* be the transitive, reflexive closure of `. So if and are IDs, the statement `* means "the TM can go from to in 0 or more steps"
The language recognized by M isL(M) = { x2* | q0 x `* u qacc v for some u,v2* }
Translation?
Note x 2 *, not *
TM language classes : 9 TM language classes Definition A language L is Turing-recognizable if there exists a TM M such that L = L(M).
Synonym: L is recursively enumerable, abbreviated "r.e."
Definition The class of all Turing-recognizable languages is 1 = { L µ * | L is Turing-recognizable }
The textbook does not assign a name like this; it just says "class of TM-recognizable langs"
Beware: The class 1 is not an alphabet like
The naming is unfortunate but better than some of the alternatives
Deciders : 10 Deciders We've seen that when you start a TM with an input x, it can do three distinct things:
Accept x
Reject x
Run forever without accepting or rejecting x
We call this "looping" -- meaning that the TM runs forever. (The "loop" might not be so simple, the point is it runs forever.)
Some TMs always accept or reject and never loop on any input whatsoever. You could easily write an example of one. A TM with this property is called a decider.
A decider always halts on every input
Decidable languages : 11 Decidable languages Definition A language is decidable if there exists a decider TM M such that L = L(M)
Synonyms: L is "computable" and "recursive"
It is in general not easy to tell if a language is decidable or not
Definition The class of all Turing-decidable languages is 0 = { L µ * | L is Turing-decidable}
Note 0 (decidable) versus 1 (recognizable) versus (alphabet)
Question: was the language from last time decidable or not?
Decidable versus recognizable : 12 Decidable versus recognizable Fact (obvious) 0 µ 1
Every decider is automatically a recognizer too
Fact (not at all obvious) 0 1
This means that there exists some language H 2 1 - 0
H is a language that can be recognized by some TM, but can't be recognized by any TM that always halts!
Fact (not at all obvious) 1 ALL
This means that there exists some language H2 2 ALL - 1
H2 is a language that can't even be recognized by any TM
Ultimately : 13 Ultimately ALL FIN Each point is a language in this Venn diagram REG RPP CFL CFPP 0 1
Slide 14 :
Slide 15 :
Missing Content : 16 Missing Content Variants of the TM that are equivalent to the “standard” (Definition 3.5, 1) model:
“Equivalent” means “There exists some machine in the alternate model that recognizes the same language as the standard model”
Multi-track machines
Can enlarge the tape alphabet to keep track of multiple things simultaneously
Multi-head machines
Can read from multiple independent positions at once, write to them, and then move them independently
Our technique for converting one of these into the "real" model was slow at run time but did the right thing
Multi-tape machines
A combination of multi-track and multi-head, proven equivalent in textbook's Theorem 3.13
Nondeterministic TMs
Proven equivalent in Theorem 3.16
We didn't fully define these models..
For instance, we didn't explicitly state how the ` relation works
Another model : 17 Another model Definition An enumerator E is a 2-tape TM with a special state named qp ("print")
The language generated by E isL(E) = { x2* | (q0 t, q0 t ) `* ( u qp v, x qp z ) for some u, v, z 2 * }
Here the instantaneous description is split into two parts (tape1, tape2)
So this says that "x appears to the left of the tape 2 head when E enters the qp state"
Note that E always starts with a blank tape and potentially runs forever
Basically, E generates the language consisting of all the strings it decides to print
And it doesn't matter what's on tape 1 when E prints
Reminder : 18 Reminder The decidable languages: 0
The recognizable languages: 1
Theorem 3.21 : 19 Theorem 3.21 L 2 1 , L=L(E) for some enumerator E (in other words, enumerators are equivalent to TMs)
Proof First we show that L=L(E) ) L21. So assume that L=L(E); we need to produce a TM M such that L=L(M). We define M as a 3-tape TM that works like this: input w (on tape #1)
run E on M's tapes #2 and #3
whenever E prints out a string x, compare x to w; if they are equal, then acceptelse goto 2 and continue running E
Theorem 3.21 continued : 20 Theorem 3.21 continued Now we show that L21 ) L=L(E) for some enumerator E. So assume that L=L(M) for some TM M; we need to produce an enumerator E such that L=L(E). First let s1, s2, be the lexicographical enumeration of *. E behaves as follows: for i:=1 to 1
run M on input si
if M accepts si then print string si(else continue with next i) DOES NOT WORK!!
Theorem 3.21 continued : 21 Theorem 3.21 continued Now we show that L21 ) L=L(E) for some enumerator E. So assume that L=L(M) for some TM M; we need to produce an enumerator E such that L=L(E). First let s1, s2, be the lexicographical enumeration of *. E behaves as follows: for t:=1 to 1 /* t = time to allow */
for j:=1 to t /* continue resumes here */
compute the instantaneous description uqv in M such that q0 sj `t uqv. (If M halts before t steps, then continue)
if q = qacc then print string sj(else continue)` exactly t steps of the ` relation
Theorem 3.21 continued : 22 Theorem 3.21 continued First, E never prints out a string sj that is not accepted by M
Suppose that q0 s5 `27 u qacc v (in other words, M accepts s5 after exactly 27 steps)
Then E prints out s5 in iteration t=27, j=5
Since every string sj that is accepted by M is accepted in some number of steps tj, E will print out sj in iteration t=tj and in no other iteration
This is a slightly different construction than the textbook, which prints out each accepted string sj infinitely many times
Closure properties of 0 and 1 : 23 Closure properties of 0 and 1 1 is closed under [, Å, ¢, ¤, reversal
Proofs for [ and Å are similar to the NFA constructions we used, if you use a 2-tape TM
Proof for reversal is also easy with a 2-tape TM
¢ and ¤ are somewhat harder
0 is closed under all of these operations and complement as well
0 closed under complement : 24 0 closed under complement Proof Suppose L20. Then L=L(M1) for some decider M1. We want to show there's a decider M2 such that L(M2) = Lc, whence (!) Lc 2 0.
This is easy: we just set M2 to be M1 with qacc and qrej swapped. This works because M1 never loops (forever) on any input; it always reaches either qacc or qrej, and so M2 does the opposite thing from M1 in terms of accepting or rejecting the string.
Proof does not work for 1 : 25 Proof does not work for 1 Without the guarantee that M1 never loops on any input, this conversion does not produce an M2 such that L(M2) = L(M1) c.
Because if x L(M1), we want x2 L(M2), but...
Preview: 1 is not closed under complement : 26 Preview: 1 is not closed under complement The previous discussion just showed that the attempted proof of closure under complement failed
It didn't show that 1 is not closed under complement
However it is in fact true that 1 is not closed under complement; will see an actual proof later
Preview: a non-recognizable L : 27 Preview: a non-recognizable L This all means that some L exists that is not recognized by any TM
What does it look like?
Is it important? YES, because of Church-Turing Thesis
Historical background : 28 Historical background Great concern in the early 20th century about the notion of effective computability.
When someone defines what they want a function to mean, is it a reasonable definition? Only if it’s "effectively computable"
Articulated by Hilbert in 1900 at the International Congress of Mathematics in Paris
Note that it's easy to use a TM to define a function f:* ! *; you just take the output to be whatever's on the tape when the machine halts
Alonzo Church (with Stephen Kleene) 1932-34, pursued effective computability via -definability ("lambda")
Was a proposal for a new foundation for writing logic and mathematics. Ultimately didn’t resonate well enough to be widely adopted for that purpose, but certainly plays an important role even today
Example lambda notation function: x. xy
Specifies "-conversion" (renaming of bound variables; parameter-argument matching) and "-conversion" (evaluation by substitution; running the function)
Given a -function f and input n, there is a deterministic procedure for working towards a “normal form” – namely, the output of the function
Church's Thesis : 29 Church's Thesis In 1934 Church claimed that “the -definable functions are all the effectively computable functions”
Kleene (his student) called it Church’s Thesis
Gödel and Herbrand ’34, developed a notion of effective computability based on provability in formal arithmetic system
Previously, in ’31, Gödel had proved that no sufficiently sophisticated/interesting formal system could be both consistent and complete. Still celebrated as a major breakthrough in mathematics.
Gödel and Herbrand started with a formal system of arithmetic (axioms & rules for proof).
And said that a function is effectively computable if for each f(m)=n we can prove that f(m)=n within the system.
Gödel did not at first accept Church’s thesis.
Kleene's partial recursive functions : 30 Kleene's partial recursive functions Kleene defined another important formalism in ’36:
A function is partial recursive if it can be generated from 0, projection, successor, substitutions, primitive recursion, and minimization.
Kleene in ’36 proved that these partial recursive functions are equivalent to the -definable functions in terms of what functions can be expressed
In other words, a function in one model could be converted into a function in the other model, just like we’ve done on a smaller scale with finite automata variants and TM variants
However, these forms of "effective computability" are all very abstract mathematical notions
Turing's 1936 paper : 31 Turing's 1936 paper Alan Turing was basically unaware of all of this work when he wrote his paper "On Computable Numbers with an application to the Entscheidungsproblem."
Entscheidungsproblem: One of 23 famous problems posed by Hilbert in 1900 at the International Congress of Mathematics
Namely: find a mechanical procedure that determines the truth or falsehood of an arbitrary, carefully-specified, mathematical assertion.
Turing proved that there was no procedure to solve the Entscheidungsproblem
We’ll see this proof, more or less
Along the way he defined his notion of computation and described Universal Turing Machines (which we’ll also see)
Turing's 1936 paper: excerpt : 32 Turing's 1936 paper: excerpt Alan. M.Turing, "On Computable Numbers, With an Application to the Entscheidungsproblem,'' Proc. London Math. Soc., 2(42) (1936), 230-265; "A correction'' ibid, 43, 544-546. Google search 9. The extent of the computable numbers.
No attempt has yet been made to show that the “computable” numbers include all numbers which would naturally be regarded as computable. All arguments which can be given are bound to be, fundamentally, appeals to intuition, and for this reason rather unsatisfactory mathematically. The real question at issue is “What are the possible processes which can be carried out in computing a number?”
Turing's 1936 paper: excerpt : 33 Turing's 1936 paper: excerpt The arguments which I shall use are of three kinds.
A direct appeal to intuition.
A proof of the equivalence of two definitions (in case the new definition has a greater intuitive appeal).
Giving examples of large classes of numbers which are computable.
Computing is normally done by writing certain symbols on paper. We may suppose this paper is divided into squares like a child's arithmetic book. In elementary arithmetic the two-dimensional character of the paper is sometimes used. But such a use is always avoidable, and I think that it will be agreed that the two-dimensional character of paper is no essential of computation.
I assume then that the computation is carried out on one-dimensional paper, i.e. on a tape divided into squares. I shall also suppose that the number of symbols which may be printed is finite. If we were to allow an infinity of symbols, then there would be symbols differing to an arbitrarily small extent. The effect of this restriction of the number of symbols is not very serious. It is always possible to use sequences of symbols in the place of single symbols. Thus an Arabic numeral such as 17 or 999999999999999 is normally treated as a single symbol.
Turing's 1936 paper: excerpt : 34 Turing's 1936 paper: excerpt Similarly in any European language words are treated as single symbols (Chinese, however, attempts to have an enumerable infinity of symbols). The differences from our point of view between the single and compound symbols is that the compound symbols, if they are too lengthy, cannot be observed at one glance.
Slide 35 : 35 “Manners are not taught in lessons,” said Alice. “Lessons teach you to do sums, and things of that sort.”
“And you do Addition?” the White Queen asked. “What's one and one and one and one and one and one and one and one and one and one?”
“I don't know,” said Alice. “I lost count.”
“She can't do Addition,” the Red Queen interrupted. Excerpt: Through the Looking Glass, Lewis Carroll
Turing's 1936 paper: excerpt : 36 Turing's 1936 paper: excerpt This is in accordance with experience. We cannot tell at a glance whether 9999999999999999 and 999999999999999 are the same.
The behaviour of the computer at any moment is determined by the symbols which he is observing. and his “state of mind” at that moment. We may suppose that there is a bound B to the number of symbols or squares which the computer can observe at one moment. If he wishes to observe more, he must use successive observations. We will also suppose that the number of states of mind which need be taken into account is finite. The reasons for this are of the same character as those which restrict the number of symbols. If we admitted an infinity of states of mind, some of them will be “arbitrarily close” and will be confused. Again, the restriction is not one which seriously affects computation, since the use of more complicated states of mind can be avoided by writing more symbols on the tape.
Turing's 1936 paper: excerpt : 37 Turing's 1936 paper: excerpt Let us imagine the operations performed by the computer to be split up into “simple operations” which are so elementary that it is not easy to imagine them further divided. Every such operation consists of some change of the physical system consisting of the computer and his tape. We know the state of the system if we know the sequence of symbols on the tape, which of these are observed by the computer (possibly with a special order), and the state of mind of the computer.
We may suppose that in a simple operation not more than one symbol is altered. Any other changes can be set up into simple changes of this kind. The situation in regard to the squares whose symbols may be altered in this way is the same as in regard to the observed squares. We may, therefore, without loss of generality, assume that the squares whose symbols are changed are always “observed” squares.
Besides these changes of symbols, the simple operations must include changes of distribution of observed squares. The new observed squares must be immediately recognisable by the computer. I think it is reasonable to suppose that they can only be squares whose distance from the closest of the immediately previously observed squares does not exceed a certain fixed amount. Let us say that each of the new observed squares is within L squares of an immediately previously observed square.
Turing's 1936 paper: excerpt : 38 Turing's 1936 paper: excerpt The simple operations must therefore include:
(a) Changes of the symbol on one of the observed squares.
(b) Changes of one of the squares observed to another square within L squares of one of the previously observed squares.
It may be that some of these changes necessarily involve a change of state of mind. The most general single operation must therefore be taken to be one of the following:
A. A possible change (a) of symbol together with a possible change of state of mind.
B. A possible change (b) of observed squares, together with a possible change of state of mind.
Turing's 1936 paper: excerpt : 39 Turing's 1936 paper: excerpt We may now construct a machine to do the work of this computer.
To each state of mind of the computer corresponds an “m-configuration” of the machine. The machine scans B squares corresponding to the B squares observed by the computer. In any move the machine can change a symbol on a scanned square or can change anyone of the scanned squares to another square distant not more than L squares from one of the other scanned squares. The move which is done, and the succeeding configuration, are determined by the scanned symbol and the m-configuration.
Turing's 1936 paper: excerpt : 40 Turing's 1936 paper: excerpt We suppose, as in I, that the computation is carried out on a tape; but we avoid introducing the “state of mind” by considering a more physical and definite counterpart of it.
It is always possible for the computer to break off from his work, to go away and forget all about it, and later to come back and go on with it. If he does this he must leave a note of instructions (written in some standard form) explaining how the work is to be continued. This note is the counterpart of the “state of mind”.
We will suppose that the computer works by such a desultory manner that he never does more than one step at a sitting. The note of instructions must enable him to carry out one step and write the next note. Thus the state of progress of the computation at any stage is completely determined by the note of instructions and the symbols on the tape.
Turing's 1936 paper: excerpt : 41 Turing's 1936 paper: excerpt That is, the state of the system may be described by a single expression (sequence of symbols), consisting of the symbols on the tape followed by A (which we suppose not to appear elsewhere) and then by the note of instructions. This expression may be called the “state formula”. We know that the state formula at any given stage is determined by the state formula before the last step was made, and we assume that the relation of these two formulae is expressible in the functional calculus.
In other words we assume that there is an axiom U which expresses the rules governing the behaviour of the computer, in terms of the relation of the state formula at any stage to the state formula at the proceeding stage. If this is so, we can construct a machine to write down the successive state formulae, and hence to compute the required number.
Turing's 1936 paper : 42 Turing's 1936 paper So Turing’s idea of effective computation was a description of what humans could do.
He submitted his paper in May, and in August submitted an appendix proving that Church’s “effective computability” (-definable) was equivalent to his own (TM-computable).
After seeing Turing's proof, Gödel became convinced that they were all talking about the same thing.
Equivalence between these models then became known as the Church-Turing thesis.
The Church-Turing Thesis : 43 The Church-Turing Thesis Any algorithmic-functional procedure that can be done at all can be done by a Turing machine
This isn't provable, because “algorithmic-functional procedure” is vague. But this thesis (law) has not been in serious doubt for many decades now
TMs are probably the most commonly used low-level formalism for functional algorithms and computation
Commonly used high-level formalisms include pseudocode and all actual programming languages. By Church-Turing thesis, these are all equivalent in terms of what they can (eventually) do.
Of course they have different ease-of-programming and time/memory efficiency characteristics.
Alan Turing: biographical notes : 44 Alan Turing: biographical notes Turing was 24 at the time he published his 1936 paper
1939, Turing was working at Bletchley Park on decrypting Enigma, the encryption device used by Germany in WWII.
The existence of the Collosus decryptor was kept secret through most of WWII
This work basically made him a war hero
1939, John Vincent Atanasoff and Clifford Berry designed and built the early ABC computer at Iowa State.
The first electronic digital computer
Personnel disruptions due to the war caused the patent application to be lost before it was submitted.
1941, Mauchly and Eckert visited ISU and went on to built ENIAC in ‘46
The first general-purpose electronic digital computer
Patent fight 1967 resolved in 1973, invalidating ENIAC patent
Followed next day by “Saturday Night Massacre”
Alan Turing: biographical notes : 45 Alan Turing: biographical notes 1945 (National Physics Lab), 1948- (Computing Machine Laboratory) Turing continued to worked on computing, artificial intelligence, and later, biology.
"Turing test" for successful artificial intelligence
ACM Turing award — most prestigious award in computer science
1952 Turing was arrested after reporting burglary
Offered a choice between prison and "organo-therapy" (hormone treatments) to "cure" his "deviance". Shrugged and went along with that, an interesting science experiment...
Turing seems to me a prototype of the modern western gay person (at a time before the word "gay" meant that): fully integrated into mainstream society yet thoroughly oppressed by it. By all accounts, he had a positive outlook on life
1954 Turing was found dead at age of 42 with an apparently poisoned apple partially eaten at his bedside. (He kept cyanide in the house for various experiments.) Believed to be suicide but not fully investigated