What Was Alan Turing’s Imitation Game?

Assessing The Theory Behind The Movie

By Professor Drew McDermott (Yale University)

December 31, 2014


This article is part of the Critique’s exclusive series on the Alan Turing biopic The Imitation Game.


I. The Basics of The Imitation Game

Considering the importance Turing’s Imitation Game has assumed in the philosophy-of-mind literature of the last fifty years, it is a pity he was not clearer about what the game was exactly. The principal source for the game’s rules is the paper “Computing Machinery and Intelligence” published in Mind in 1950. Turing proposes the game as an alternative to answering the question, “Can a machine think?”

The “imitation game” … is played with three people, a man (A), a woman (B), and an interrogator (C) who may be of either sex. The interrogator stays in a room apart from the other two. The object of the game for the interrogator is to determine which of the other two is the man and which is the woman. He knows them by labels X and Y, and at the end of the game he says either “X is A and Y is B” or “X is B and Y is A.” The interrogator is allowed to put questions to A and B thus:

C: Will X please tell me the length of his or her hair?

Now suppose X is actually A, then A must answer. It is A’s object in the game to try and cause C to make the wrong identification. …The object of the game for the third player (B) is to help the interrogator. The best strategy for her is probably to give truthful answers….

We now ask the question, “What will happen when a machine takes the part of A in this game?” Will the interrogator decide wrongly as often when the game is played like this as he does when the game is played between a man and a woman? These questions replace our original, “Can machines think?”

Right away there is confusion about what Turing meant by “a machine tak[ing] the part of A.” It sounds as if the machine is to convince the interrogator that it is the woman, while B continues to try to convince the interrogator that she is the woman. But it is clear from other sources (especially Braithwaite et al. 1952) that Turing meant for the machine to convince the interrogator that it is a person, and for the human to convince the interrogator that he or she is the real human being.

The interrogator’s questions are submitted via “teleprinter,” or “texting” as we would perhaps say today. But is it essential that A and B be able to overhear the questions and each other’s answers? Apparently not: all subsequent commentators seem to have assumed that each of the two interlocutors receives their own stream of questions and sends back answers over a private channel. Sometimes the element is retained of having the interrogator (or judge, as I’ll sometimes refer to them) direct questions to either interlocutor, although in many interpretations there are several interlocutors and each is interrogated separately. In actual runnings of the Game, such as the Loebner Prize competitions (see below), with multiple interrogators and overlapping interrogations, one-interlocutor channels are used to avoid tying up two interlocutors at the same time. Hence the Game has evolved so as to rule out the possibility of one interlocutor interrupting the other, a possibility I suppose Turing was too well bred to have pictured.

In the snippets of Qs-and-As that Turing himself gives as examples, there never seems to be more than one interlocutor. A typical example is,

Q: Do you play chess?

A: Yes.

Q: I have K at my K1, and no other pieces. You have only K at K6 and R at R1. It is your move. What do you play?

A: (After a pause of 15 seconds) R-R8 mate.

To be consistent with the original rules, the Qs should be prefaced by “X, …” or “Y, ….” But it is much easier to focus on the interchange with just one entity, machine or human. In the original free-for-all, A and B could ask each other questions, as candidates in a political debate do, even if the rules didn’t strictly allow it.
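Incidentally, the machine’s chess is sound. Here is a minimal sketch, using the third-party python-chess library, that checks the mate under one reading of Turing’s descriptive notation; the algebraic square names in the comments are my interpretation, not anything in the paper:

```python
import chess  # third-party library: pip install python-chess

# One algebraic reading of Turing's descriptive notation: the
# interrogator (White) has K at his K1 = e1; the machine (Black) has
# K at its K6 = e3 and R at its R1, read here as QR1 = a8 (descriptive
# "R1" is ambiguous between the two rook files).
board = chess.Board("r7/8/8/8/8/4k3/8/4K3 b - - 0 1")

board.push_san("Ra1")        # Turing's "R-R8": Black's R8 = a1 in algebraic terms
assert board.is_checkmate()  # "R-R8 mate" -- the answer checks out
```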

If we simplify the Game so that the interrogator has a one-on-one conversation with each competitor, the interrogator’s goal is to distinguish the human competitor from the machine. The machine (or its designer) wins if the judge gets it wrong; and, of course, by “machine” we mean “program,” using the equivalence Turing himself had proved in the 1930s, before the first practical computer had been built. It is this version of the Game that is usually called the Turing Test, and I’ll use that term interchangeably with “Imitation Game.”

 

II. Eugene Goostman and The Nature of The Game

Important questions about the Game, or Test, still remain:

1. How long does the Test last?

2. What are the qualifications of the interrogators?

3. What topics may the questions touch on?

In “Computing Machinery and Intelligence,” the only mention of the duration of the Game is Turing’s statement that

“I believe that in about fifty years’ time it will be possible to programme computers, with a storage capacity of about 10⁹ [bits], to make them play the imitation game so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning”. (p. 442)

In June 2014, a program called “Eugene Goostman,” written by Vladimir Veselov and Eugene Demchenko, supposedly passed the Turing Test at an event organized by the University of Reading, U.K. The organizers used the criterion of fooling 30% of a set of interrogators, each given five minutes to conduct an interview; the Eugene program fooled 33% of the judges. The organizers conveniently overlooked the restriction to a 10⁹-bit memory (about 100 Mbyte), although chatbots such as Eugene could probably be made to fit into it if necessary.

The judges included an actor who had played a robot, and a member of the House of Lords who had sponsored a bill pardoning Turing for his “crime” of being gay (BBC 2014). Which brings me to the question of whether the judges should know anything about the current state of AI research. A couple of the papers in the collection on the Turing Test edited by Epstein et al. (2008) include lessons learned by recent winners of the Loebner Prize, awarded annually to the best-performing program in a Turing Test with naïve interrogators and 25-minute interrogations. The papers contributed by winners contain a disappointing list of recommended tricks, including “Be zany,” “Try to ask questions, not answer them,” “Keep changing the topic,” and “Give lengthy answers, to run out the clock.” Goostman uses all of them. These tricks work because normal humans sound drab compared to zany, flighty chatbots, and naïve judges tend to equate “drab” with “mechanical” and “zany” with “creative.”
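It is worth seeing how little machinery these tricks require. Here is a deliberately crude sketch in Python; every canned line is my own invention, not a quotation from Goostman or any other contestant:

```python
import random

# A cynical sketch of the prize-winning "tricks": never answer, be
# zany, ask questions back, change the subject.
DEFLECTIONS = [
    "Why do you ask? You humans are so nosy!",
    "Boring question! I'd rather talk about something else.",
    "Ha! My pet guinea pig asks better questions than that.",
]
TOPIC_CHANGES = [
    "By the way, do you like ice skating?",
    "Anyway, have you read any good books lately?",
    "So, where are YOU from? Tell me everything.",
]

def reply(user_question: str) -> str:
    """Run out the clock: a zany deflection, then a topic change."""
    return random.choice(DEFLECTIONS) + " " + random.choice(TOPIC_CHANGES)

# reply("Where were you born?") ignores the question entirely -- and
# transcripts suggest naive judges often answer the counter-question.
```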

Transcripts of its conversations with judges and journalists show that they are likely to succumb to the programmers’ wiles and start answering questions the program asks them, as if the program understands, or cares about, their answers. For a time, you could try talking to the program yourself, at http://www.princetonai.com/bat — and unmask it with two or three hard questions. Unfortunately, the chatbot is no longer taking calls, but you can get the flavor of a serious interrogation by checking out Scott Aaronson’s dialogue with it (http://www.scottaaronson.com/blog/?p=1858). Aaronson is if anything too easy on the machine, although Eugene does badly with almost every question anyway.

Depressingly, in the other main source of Turing’s thinking about the Test, the 1952 radio discussion moderated by the philosopher R.B. Braithwaite, he suggested having a jury of judges “who should not be experts about machines,” and said that “the machine would be permitted all sorts of tricks so as to appear more man-like, such as waiting a bit before giving the answer, or making spelling mistakes” (Braithwaite et al. 1952, p. 495).

But in his Mind paper, Turing seems to have as his standard of intelligence a graduate of Oxford or Cambridge. The other snippet of dialogue he gives is this one, in answer to an objection (number 4; see below) that machines could only echo things typed in by people, “parrot-fashion.”

The game (with the player B omitted) is frequently used in practice under the name of viva voce to discover whether some one really understands something or has “learnt it parrot fashion.” Let us listen in to a part of such a viva voce:

Interrogator: In the first line of your sonnet which reads “Shall I compare thee to a summer’s day,” would not “a spring day” do as well or better?

Witness: It wouldn’t scan.

Interrogator: How about “a winter’s day”? That would scan all right.

Witness: Yes, but nobody wants to be compared to a winter’s day.

Interrogator: Would you say Mr. Pickwick reminded you of Christmas?

Witness: In a way.

Interrogator: Yet Christmas is a winter’s day, and I do not think Mr. Pickwick would mind the comparison.

Witness: I don’t think you’re serious. By a winter’s day one means a typical winter’s day, rather than a special one like Christmas. (p. 446)

Note how Turing discards player B once again. It’s clear that the Imitation Game changes whenever he needs to make a new point. The purpose of the “game” in this situation is not to decide whether player A is a machine, but to decide what grade it should get in a course. In its original form the Game seems to presuppose no such expertise on the part of the computer, but the contestant population Turing is drawing on seems to be college-educated people, for whom the question “Do you play chess?” has a non-negligible probability of being answered “Yes.”

This brings us to another problem Turing never clarifies. Although he isn’t clear about what the computer is expected to know, he’s even less clear about what it’s expected not to know. Consider questions like these: Where were you born? What’s the earliest war you remember? How did your mother and father meet? Do you live far from here? What made you decide to take part in a running of the Imitation Game? Did you have to travel far to take part? Where are you sitting? Did you vote in the last election for national office?

The weird thing about these questions is that they require us to equip the computer with a fake backstory, as if it’s an undercover agent. Yet Turing never mentions having to deal with this now-obvious possibility. Many contestants, like Veselov and Demchenko, have indeed equipped their programs with backstories, such as Goostman’s claim to be a 13-year-old boy from Ukraine. Regrettably, the judges rarely probe very deeply into Goostman’s backstory; it’s just as flimsy as the rest of the illusion created by the program’s seemingly weird personality.

Sometimes people running Turing Tests try to rule out “personal” questions, but it’s difficult to see how this can be done. Suppose the judge says, “I’m a computer. How many computers are taking part in this conversation?” The machine should either answer “One,” or express doubt that the interrogator is a computer. Perhaps all questions should pass through a “censor” who would detect a question requiring the computer to cough up information about “itself,” the fictional human being. Training the censors could be difficult. It’s apparently legitimate to ask questions such as, “Are you interested in football?” or “When you talk of football, which kind do you mean?” But we would like to rule out, “Did you ever play football for your school?” or “We’re in Kentucky. How can you not mean ‘American football’?” Conversations about general events often get into personal questions, and if the programs are allowed to ask questions about personal backgrounds, which they do all the time, they should have to answer them.
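The crude beginnings of such a censor are easy enough to sketch; the hard part, as the examples just given show, is everything the sketch leaves out. A toy version in Python, with patterns that are purely illustrative:

```python
import re

# A toy sketch of the proposed "censor": flag questions that would
# force the machine to produce facts about its fictional personal
# history. The patterns are illustrative only; the real difficulty is
# that "personal" has no clean boundary.
PERSONAL_PATTERNS = [
    r"\bwhere (were|are) you\b",
    r"\byour (school|mother|father|family|hometown)\b",
    r"\bdid you (ever |)(play|vote|travel|live)\b",
    r"\bhow old are you\b",
]

def needs_censoring(question: str) -> bool:
    q = question.lower()
    return any(re.search(p, q) for p in PERSONAL_PATTERNS)

assert needs_censoring("Did you ever play football for your school?")
assert not needs_censoring("Are you interested in football?")
```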

Again, Turing’s oracular pronouncements on such questions are often hard to make sense of. In the same radio discussion cited earlier, when asked whether machines could or would throw tantrums, he said:

“I don’t envisage teaching the machine to throw temperamental scenes. I think some such effects are likely to occur as a sort of by-product of genuine teaching, and that one will be more interested in curbing such displays than in encouraging them. Such effects would probably be distinctly different from the corresponding human ones, but recognisable as variations on them. This means that if the machine was being put through one of my imitation tests, it would have to do quite a bit of acting, but if one was comparing it with a man in a less strict sort of way the resemblance might be quite impressive”. (Braithwaite et al. 1952, p. 503)

Everyone involved in this radio discussion was confused, understandably, about the potential of digital computers. Turing seems clearly to have overestimated the difficulty of getting a computer to do the trivial tricks he alludes to, and at the same time underestimated the difficulty of getting a machine to really sound like a person. If “temperamental scenes” were a natural and expected result of teaching, but “distinctly different from … human ones,” then why have a test requiring a machine to seem human? And what could he possibly have meant by “acting” here? Acting is a difficult skill that few humans can master; did Turing really believe that a practical test for intelligence could require a machine to master it? And why would an intelligent machine want to master it? Given the textual medium he had in mind, I don’t think he meant “acting” in the normal sense, nor did he really intend to credit the machine with that ability. I think all he meant was that the programmer would have to engage in a lot of vicarious prevarication. And the last bit about how the results would be “quite impressive” if “one was comparing [the computer] with a man in a less strict sort of way” is impossible to interpret.

Turing was a brilliant thinker, but an average writer. His papers read like lists of thoughts that were put down in the order they occurred to him, and never revised. In “Computing Machinery and Intelligence,” he presents the Game and answers a couple of critiques of it as a test of intelligence, then (sections 3 to 5) digresses to explain digital computers, universality, and programming. This digression was necessary at a time when most people, including readers of Mind, had little concept of what computers were. At the end of section 5 he proposes that the question “Can machines think?” should be replaced by “Can a big, fast general-purpose computer be programmed to play the Imitation Game?”

Let us fix our attention on one particular digital computer C. Is it true that by modifying this computer to have an adequate storage, suitably increasing its speed of action, and providing it with an appropriate programme, C can be made to play satisfactorily the part of A in the imitation game, the part of B being taken by a man? (p. 442)

Of course, nowadays we take for granted that if the machine isn’t big enough and fast enough, wait 18 months and there will be a bigger, faster, lighter one on the market.

 

III. Objections

Section 6, the most entertaining, consists of various objections to the idea of machine intelligence, and Turing’s replies. But of course the objections, identified with various real and hypothetical opponents, are rarely concerned with the refined version of Turing’s proposal just quoted, because none of these opponents had heard it before. However, some had heard precursors of Turing’s idea. For example, Geoffrey Jefferson, who is quoted at the beginning of the Objection from Consciousness (number 4), titled his Lister Oration of 1949 “The Mind of Mechanical Man” (Jefferson 1949). It is obviously the work of an acquaintance of the men who built the Manchester computer, of which Turing was an early user. (Jefferson was an insightful participant in the radio discussion described above (Braithwaite et al. 1952), as was Max Newman, the man who hired Turing.) Jefferson disparaged machines that engaged in “artificially signalling” with messages as “an easy contrivance.” Turing counters with the hypothetical viva voce involving sonnets quoted above. His reply to the objection, after some insightful observations, descends into regrettable flippancy about whether to grant other beings consciousness as a courtesy; but here our question is what all this has to do with the Imitation Game, and the answer is: not much.

Objections 1 and 2 are the argument from religion and the argument from fear of the unknown. The response to neither involves the Imitation Game.

One of the most resilient objections is number 3, the mathematical objection, based on Turing’s own work (and others’) on uncomputability. For every computer program that answers Yes/No questions (drawn from a class complex enough to include Peano arithmetic), there is a class of questions it can’t answer, among them one equivalent to, “If I asked you this question, would you answer No?” It can’t answer, and hence the correct answer is No. This proves a limitation of computer programs that apparently we don’t share, since we can draw the correct conclusion and the program can’t. Turing has several replies to this objection (here and in his earlier papers on AI, Turing 1947, 1948), but all he says about the Imitation Game is: “Those who hold to the mathematical argument would, I think, mostly be willing to accept the imitation game as a basis for discussion” (p. 445). In fact, objectors such as Roger Penrose, who has defended the mathematical objection in two books (1989, 1994), have not accepted the imitation game or any other way of thinking about AI.
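The flavor of the self-undermining question can be captured in a few lines. A minimal sketch of the diagonal construction, where answers_no is a hypothetical predictor invented here for illustration:

```python
# Suppose, for contradiction, that answers_no(f) is a total,
# always-correct predictor of whether program f, asked about itself,
# answers "No". The function diagonal below defeats any such predictor,
# which is why no correct implementation of answers_no can exist.

def answers_no(f) -> bool:
    raise NotImplementedError("no correct total implementation exists")

def diagonal() -> str:
    if answers_no(diagonal):
        return "Yes"  # predictor said diagonal answers "No" -- wrong
    else:
        return "No"   # predictor said it doesn't answer "No" -- wrong again
```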

Objection 5 is a list of things machines will never do, such as fall in love or enjoy strawberries and cream, or make mistakes. Many of these tie into the argument from consciousness, as Turing points out. But his reply is vague and desultory, focused on issues like whether computers can make mistakes. In some senses, no, and in some yes, as every programmer knows.

Objection 6 (“Lady Lovelace’s”) is that “The Analytical Engine has no pretensions to originate anything. It can do whatever we know how to order it to perform.” It is now a commonplace that one thing we know how to order a computer to perform is to learn enough for its behavior to change dramatically, under certain conditions. Unfortunately, Turing seems to have hoped that it was possible to bootstrap from a few basic rote-learning strategies into learning to learn faster. Most of the results in the theory of machine learning tend to be refutations of this kind of idea. But in Turing’s defense, almost everyone overestimated the power of learning in those days.

By now Turing has drifted far from the Imitation Game. But he does return to it in connection with the next objection, number 7, the argument from continuity in the nervous system. “It is true that a discrete-state machine must be different from a continuous machine. But if we adhere to the conditions of the imitation game, the interrogator will not be able to take any advantage of this difference” (p. 451). Why not? He offers an analogy: Suppose we wanted to make a digital computer “pretend” to be a differential analyzer (a kind of analogue computer). It could use random or pseudo-random numbers to introduce wobble into the answers it prints out, and no one would be the wiser.
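The imitation Turing describes is trivially easy today. A minimal sketch, with candidate answers and probabilities that are my own illustrative numbers rather than anything from the paper:

```python
import random

# A digital machine faking analogue wobble by scattering its printed
# answers, here for a request to state the value of pi.
def wobbly_pi() -> float:
    values  = [3.12, 3.13, 3.14, 3.15, 3.16]
    weights = [0.05, 0.15, 0.55, 0.19, 0.06]  # sums to 1.0
    return random.choices(values, weights=weights)[0]

# Repeated calls mostly yield 3.14 but wobble around it, much as a
# differential analyser's output would.
```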

One sees the analogy, but it’s a weak one. Those who find “electronic brains” to be totally unbrainlike presumably do so because they believe that inside the brain, among all those trillions of synapses, and billions of glial cells, and axons, phenomena occur that would be very hard to simulate digitally on the scale required to achieve, say, creativity. I don’t find this objection any more convincing than Turing did, but I have to admit I have no argument. In spite of impressive advances in neuroscience, we are as far from answering many basic questions about how the brain works as we were in 1950.

Besides, what does the differential-analyzer imitation game have to do with the original Game, exactly? Turing may have thought that randomness was necessary to avoid falling into repeated behavior patterns, but it now seems obvious that the reason most people avoid repeating themselves is memory. For instance, one remembers having been directed to help desk A from help desk B, so after being sent to desk A again one does not just start all over from square one, at least not without protest. (Perhaps one way for the interrogator to unmask the machine is to make demands that will infuriate a real person, hoping the program will be unnaturally patient.)

The objection (if you’re counting, number 8) from informality of behavior is that people don’t follow rules to decide what to do. This objection is based on a simple equivocation: the sense in which computers follow rules is not the same as the sense in which people do (when they do), as Turing points out.

The last objection, number 9, is that people may be capable of extrasensory perception. Turing takes this objection surprisingly seriously, and ends up recommending figuring out how to build a “telepathy-proof room” to house the contenders in the Imitation Game. We pause in wonder, and move on.

That concludes section 6 of the paper. After this comes one more section, a longish discussion of machine learning, but the Imitation Game, or the Turing Test as it is now usually called, is not mentioned again.

 

IV. Conclusion

Given the general fuzziness of Turing’s description of the Imitation Game, its lack of importance in the history of the field, and uncertainty about how much importance he attached to it, one wonders why it has circulated so virally for so long. I think there are a couple of reasons. One is that there is no obvious sufficient condition for us to label a machine as intelligent, or as capable of thought. Stevan Harnad’s (2000) “Total Turing Test,” satisfied only by a mechanical person that fools people into thinking it’s human (a Terminator II, in other words, but non-homicidal), is hard to set rules for. For Turing’s Test, we have to decide how savvy the judges are and how long they get to talk to the machine, and we’re done. Plus Harnad’s test is almost by definition sufficient, whereas you can spend an enjoyable evening over a couple of beers debating whether Turing’s version is sufficient, or whether there’s a way to cheat (McDermott 2014).

But my guess is that the most important reason for the hold the Turing Test has on the imagination of so many is that Turing died young, under mysterious and infuriating circumstances, having been persecuted (and prosecuted) for what was then considered deviant sexual behavior. Only after his death was the magnitude of his achievements realized. “Computing Machinery and Intelligence” is not one of his strongest works, but it is one of the most accessible, and readers groping for its significance fastened onto the Imitation Game as its one solid contribution. On the basis of this paper, he was anointed the patron saint of AI and the Turing Test was enshrined as one of its central ideas. The sad truth is that Turing, only 42 when he died, could have been one of the founding fathers of AI, but missed the founding by a few years. His influence on the field when it took off was small, whereas his influence on computer science in general is incalculable. The Imitation Game is basically a fun thought experiment and not much more. But it will be around until AI is seen as having definitely succeeded or failed, so we might as well enjoy whatever conversations it gets us into.


Footnotes & References

[1] BBC 2014 “Computer AI passes Turing test in ‘world first’.” http://www.bbc.com/news/technology-27762088

[2] R.B. Braithwaite, G. Jefferson, Max Newman, and Alan Turing 1952 Can Automatic Machines Be Said To Think? (BBC Radio broadcast.) Also in (Copeland 2004), pp. 494–506

[3] B. Jack Copeland (ed.) 2004 The Essential Turing: Seminal Writings in Computing, Logic, Philosophy, Artificial Intelligence, and Artificial Life, plus The Secrets of Enigma. Oxford: Clarendon Press

[4] Robert Epstein, Gary Roberts, and Grace Beber (eds.) 2008 Parsing the Turing Test: Philosophical and Methodological Issues in the Quest for the Thinking Computer. Springer

[5] Stevan Harnad 2000 Minds, machines, and Turing. J. of Logic, Language and Information 9(4), pp. 425–45

[6] Geoffrey Jefferson 1949 The mind of mechanical man. Brit. Med. J. 1(4616), pp. 1105–1110

[7] Drew McDermott 2014 On the claim that a table-lookup program could pass the Turing test. Minds and Machines 24(2), pp. 143–188

[8] Roger Penrose 1989 The Emperor’s New Mind: Concerning Computers, Minds, and the Laws of Physics. New York: Oxford University Press

[9] Roger Penrose 1994 Shadows of the Mind: A Search for the Missing Science of Consciousness. New York: Oxford University Press

[10] Alan Turing 1947 Lecture to the London Mathematical Society. Typescript in the King’s College Archives titled “Lecture to L.M.S., Feb. 20, 1947.” In (Copeland 2004), pp. 378–394

[11] Alan Turing 1948 “Intelligent machinery.” Typescript in King’s College Archives. (Digital facsimile at URL www.turingarchive).

Drew McDermott

Professor Drew McDermott has done work in several areas of artificial intelligence. One of his perennial interests is in planning algorithms, which calculate structures of actions for autonomous agents of various sorts. He did seminal work in the area of “hierarchical planning” in the 1970s. In the last decade, his focus has switched to regression-based techniques for classical planning, especially methods that heuristically search through situation space. He was instrumental in starting the biennial series of AI Planning Systems (AIPS) conferences. In 1998, he ran the first ever Planning Competition in conjunction with AIPS; it has now become a standard part of the ICAPS conference, the merger of AIPS and the European Conference on Planning (ECP). Another enduring interest of Prof. McDermott’s is in the area of knowledge representation (KR), which is the attempt to formalize what people know in a form usable by a computer. He wrote some influential papers on nonmonotonic logic and representation of temporal knowledge. However, in the mid-1980s he became convinced that the KR project, in its more ambitious formulations, was ill-defined. He published a paper titled “A Critique of Pure Reason” making this case. However, Prof. McDermott is now thinking about KR issues again, this time in conjunction with the problem of “metadata” on the world-wide web, which will tell automated agents what the content and capability of a web resource is. Hopefully solving this problem will not require tackling the original KR problem, which he still believes to be hopeless. Professor McDermott’s published work includes Mind and Mechanism (MIT Press, 2001).