University of Reading stands by its appalling Turing Test press release
On June 8, 2014, the University of Reading issued a press release: “Turing Test success marks milestone in computing history”.
This resulted in a massive media frenzy, followed by a backlash (e.g., Techdirt) arguing that the press release had overhyped the story and thus led to appalling media coverage. Of course, uncritical and lazy journalists share some of the blame for misleading the public, but in my opinion the University of Reading’s press release started the problem.
Hence, I emailed the university and below you can see our chain of communication.
Charles Heymann (Head of News, University of Reading) has confirmed that he is “happy” for me “to put this correspondence into the public domain”. He also stated that “the Vice-Chancellor … has seen the full correspondence and has nothing further to add.”
There are no fireworks, just a complete brick wall in terms of admitting any errors.
EMAIL 1 from Simon Singh
Dear Sir David [Vice-Chancellor],
I am concerned about recent press coverage of a Turing Test competition. The reporting appears to have been a response to a University of Reading press release. The coverage was often inaccurate, largely because the press release appears to have been exaggerated. I am not sure who is to blame (those who were quoted? those who compiled the press release and perhaps took quotes out of context? those who pushed for a hyped press release?), but the result has been very poor media coverage of an important area of computer research.
In the last 48 hours, many have expressed similar concerns, and the backlash (unfortunately, sometimes personal) to the story has been growing. I am hoping that the press office has learned some lessons about the harm caused by overhyping stories. I would appreciate your reaction to this, and whether or not you acknowledge the problems caused by the Reading press release.
I look forward to hearing from you. Some examples of the backlash and the original press release are below.
Best Regards,
Simon Singh.
Science Journalist & Author of “Fermat’s Last Theorem”, etc.
http://www.neowin.net/news/that-claim-that-a-computer-passed-the-turing-test-was-crap—and-heres-why
http://www.buzzfeed.com/kellyoakes/no-a-computer-did-not-just-pass-the-turing-test
http://www.huffingtonpost.com/2014/06/09/turing-test-eugene-goostman_n_5474457.html
https://www.techdirt.com/articles/20140609/07284327524/no-computer-did-not-pass-turing-test-first-time-everyone-should-know-better.shtml
http://www.reading.ac.uk/news-and-events/releases/PR583836.aspx
REPLY 1 from Dr Huma Shah
Dear Simon,
Thank you for copying me in on an email to Sir David Bell.
The hysteria in the media, with articles/comments from people like Jack Copeland and Stevan Harnad, has emerged from a misunderstanding of what exactly is ‘passing the Turing test’.
I understand one term in the original press release “supercomputer” caused concern: when people tried to chat to Eugene Goostman machine on the web they found that it was not what they imagined – probably robots in the movies have raised our expectations.
As you know, Turing’s ideas on examining machine intelligence was through unrestricted text-based questions-answers put by “average interrogators” (Turing, 1950ǂ) in a “jury panel” (Turing, 1952*). If the media would refer to Turing’s scholarship it might enlighten them.
I’ll try now, because it explains our experiment design.
In 1950, in ǂ‘Computing Machinery and Intelligence’ (Mind journal, Vol. 59, No. 236, pp. 433-460) Turing wrote:
“In order that tones of voice may not help the interrogator the answers should be written, or better still typewritten” (p. 434)
– and
“The question and answer method seems to be suitable for introducing almost any one of the fields of human endeavour that we wish to include” (p. 435)
Turing then gave an example question-answer exchange, in his rebuttal to the Argument from Consciousness, a response to a “very well expressed” opinion of “Professor Jefferson’s Lister Oration in 1949”, between a human interrogator and a ‘witness’ machine in a viva voce (one-to-one) test, Turing said
“if the answers were as satisfactory and sustained as in the above passage I do not think he would describe it as an ‘easy contrivance’” (p. 446)
About duration for the machine examination by text-based conversation Turing said:
“I believe that in about fifty years’ time it will be possible to programme machines, with a storage capacity of about 10 to the power of 9, to make them play the imitation game so well that an average interrogator will not have more than 70 per cent. chance of making the right identification after five minutes of questioning” (Mind journal, p. 442).
The above quote is where the 30% pass rate interpretation comes from.
Our tests were of 5 minutes (first impressions) duration implementing the simultaneous comparison Turing test, that is, each human judge interrogated two hidden entities in parallel at the Royal Society 6-7 June 2014.
After the 150 tests had concluded, the Judges’ scores were checked and verified by an independent adjudication team led by Professor John Barnden of Birmingham University. The judges were unidentifiable to the adjudication team from their score sheets, having event IDs J1 and so on to J30 for the 30 human interrogators. These judges, as well as the 30 hidden humans came from all walks of life: academics, to school pupils, mathematicians to vets, male and female, adults and teenagers, native and non-native English speakers.
Eugene Goostman machine convinced one third of the 30 human judges that it was human: ten of the judges thought they were talking to a human when in fact it was a machine.
This performance by Eugene is not unusual or unexpected, because it convinced 29.17% of a different set of 30 judges that it was human in the Bletchley Park Turing tests held on the 100th anniversary of Alan Turing’s birth, 23 June 2012. And in 2008 Reading University Turing test event, Eugene convinced 25% of the judges that it was human (though it was placed second in that event).
Our Turing test event at the Royal Society was open and free to all and anyone to attend, see the Turing2014 blog for pictures of visitors standing around the two large TV screens displaying conversations from the Judges’ terminals: http://turingtestsin2014.blogspot.co.uk/2014/06/eugene-goostman-machine-convinced-3333.html
The backlash comes from an ignorance of what the Turing test actually is – reading Turing’s scholarship might help, and also different philosophical interpretations of it.
We challenge those who disagree with our Turing tests to stage their own public Turing test experiment and submit their results for peer-review.
I stand by the press release and our Turing test experiment based on my PhD: ‘Deception-detection and machine intelligence in practical Turing tests’.
We will be submitting the results from the Royal Society Turing tests, all the scores and corresponding judge-hidden entities’ conversations for peer-review in the relevant academic journals as we have done previously. One of the last from the 2012 tests can be found in the journal IEEE Transactions on Computational Intelligence and AI in Games: ‘Good Machine Performance in Turing’s Imitation Game’ here:
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=6609034
Thank you,
Huma
* A.M. Turing. Can Automatic Calculating Machines be said to Think? 1952. In S.B. Cooper & J. van Leeuwen (Eds) Alan Turing: His Life and Impact. Elsevier: Oxford. 2013
—-
Dr. Huma Shah
Co-ordinator 2014 Turing tests at The Royal Society:
– https://www.youtube.com/watch?v=0Hgw9RVwbaw&feature=youtu.be
Contributing author in 2013 RR Hawkins Prize winning book ‘Alan Turing: His Life and Impact’
http://www.proseawards.com/current-winners.html
Robotics, Intelligence, Control and Energy Lab School of Systems Engineering http://www.reading.ac.uk/sse/about/staff/h-shah.aspx
http://reading.academia.edu/HumaShah
Wired: Beyond Asimov: the struggle to develop a legal framework for robots http://www.wired.co.uk/news/archive/2013-02/18/robolaw?page=all
Guardian: Smart robots, driverless cars
http://www.theguardian.com/technology/2013/oct/20/artificial-intelligence-impact-lives#start-of-comments
Member of the Alan Turing Centenary Advisory Committee (TCAC) http://www.mathcomp.leeds.ac.uk/turing2012/
Lead scientist Turing100 in 2012 at Bletchley Park & Coorganiser Turing Education Day: http://turing100.blogspot.co.uk/ http://www.aboutmyarea.co.uk/Buckinghamshire/Milton-Keynes/MK1/News/Local-News/225953-Turing-Education-Day-at-Bletchey-Park
EMAIL 2 from Simon Singh
Dear Huma,
Thank you for your prompt response, but I think you fundamentally miss or ignore the key point. The media hysteria was a result of a press release that was in my opinion designed to generate media hysteria. I look forward to hearing from Sir David Bell. In the meantime, below are some phrases in the press release that helped generate the hysteria.
Simon.
An historic milestone in artificial intelligence …. Turing Test was passed for the very first time … This historic event … “In the field of Artificial Intelligence there is no more iconic and controversial milestone than the Turing Test… This milestone will go down in history as one of the most exciting.”
REPLY 2 from Charles Heymann (Head of News)
Dear Simon.
Thanks for your email.
I am the Head of News at the University of Reading and the Vice-Chancellor’s office has asked me to respond fully on his behalf.
Professor Warwick and Dr Shah have run three Turing Test experiments and published seven peer-reviewed papers together on the subject, as well as other publications. Their book on the Test will be published by Cambridge University Press later this year: http://www.reading.ac.uk/sse/about/staff/h-shah.aspx. Professor Warwick and Dr Shah will be writing a research paper on this year’s Turing Test event – which they will submit for peer review and publication in due course.
The Turing Test event at the Royal Society was open to the public, with the tests and results independently verified by Professor John Barnden from the University of Birmingham. They set careful and strict parameters in line with their experienced interpretation of the Test – including simultaneous tests; unrestricted conversations; and judges not being told anything about computer programmes/hidden human they were conversing with beforehand. Dr Shah has set all of this out in detail to you.
In Professor Warwick and Dr Shah’s view, it was the first time that a genuine Turing Test had been “passed”, which they considered a highly significant milestone.
The Turing Test is controversial by definition – there is debate whether it is a genuine test or thought experiment let alone whether there is a “passmark”. Any declaration of it being “passed” is bound to generate debate which Professor Warwick is more than happy to engage in.
The University’s academic reputation is worth far more than chasing publicity for the sake of publicity. This is a principle we apply day-in, day-out in communicating our research responsibly – across a wide range of research areas including climate change, agriculture, meteorology, food nutrition and cardiovascular research.
In this case, the Press Office’s role was to support Professor Warwick and Dr Shah communicate their research to a wider audience. The press release was written to reflect their assessment of the experiment – my team drafts and signs off every single piece of communications material with our academics. It is not in the University’s interests to put factually inaccurate information in the public domain (and to be clear, we amended the release on Tuesday to describe ‘Eugene’ more accurately).
Both Professor Warwick and Dr Shah, as she wrote to you, stand by the press release. In short, if you disagree with the press release, then you disagree with the Turing Test experiment run by them.
I am somewhat bemused on your wider point of “media hysteria” generated solely by our press release. I have 10 years of national media experience – mainly in Whitehall comms. Journalists are not automatons who regurgitate anything put in front of them.
We made Professor Warwick and Vladimir Veselov (Eugene’s designer) available for interview – both have been interviewed by dozens of print, online and broadcast journalists around the world this week who were able to challenge the findings. In addition – Professor Warwick has blogged for The Independent to respond to some of the debate: http://www.independent.co.uk/biography/kevin-warwick-9528851.html
Throughout, he has been clear that Eugene is a clever but limited computer programme – he has been careful not to talk up its technical prowess.
I am sorry you do not feel we have been accurate. You have a right to your view. But I fear we will have to agree to disagree. As I said, the results of the experiment will be submitted for peer review and published in due course. I am happy to make sure that you have a copy.
Very kind regards,
Charles
EMAIL 3 from Simon
Dear Charles,
Thanks for your reply.
I am sorry, but I am not satisfied with your reply.
1. I would like the VC to reply and confirm that Sir David personally stands by this press release, or at least supports the press office’s judgement in this matter. I am not surprised that you stand by your own press release, but I am more concerned about the view of a senior figure at the university.
2. Your statement “Throughout, he has been clear that Eugene is a clever but limited computer programme – he has been careful not to talk up its technical prowess.” conflicts with the statements in the press release, highlighted in my previous email:
An historic milestone in artificial intelligence …. Turing Test was passed for the very first time … This historic event … “In the field of Artificial Intelligence there is no more iconic and controversial milestone than the Turing Test… This milestone will go down in history as one of the most exciting.”
3. I assume that you have no problem with this correspondence being made public, and that Huma is happy for her reply to be made public.
Regards,
Simon.