
The standard Turing test is not a valid and reliable test for HLMI.

The Turing test (TT) aims at a quality and not a quantity. Even if judges can give scores, in the end any score of humanness is meaningless. My view is that the fault of the TT is one of interpretation and experimental design rather than experimental concept. To show this, I propose a new version of the TT, called QTT. In the QTT, the entity [Footnote 1] must accomplish a yes/no enquiry in a humanlike and strategic way, where ‘strategic’ means with as few questions as possible. [Footnote 2] My claim is that the QTT (i) improves the experimental design of the TT, by minimising both the Eliza Effect [Footnote 3] and the Confederate Effect [Footnote 4], and (ii) prevents both Artificial Stupidity [Footnote 5] and Blockhead [Footnote 6] from passing.

The rest of the paper is structured as follows. In the next section, I review two interpretations of the TT: the Original Imitation Game (OIG), advocated by Sterrett (2000), and the Standard Turing Test (STT), advocated by Moor (2001). In Section 3, I discuss two problems with the TT: (i) Artificial Stupidity and (ii) Blockhead. In Section 4, I introduce the QTT, describe my study, and show the results obtained. In Section 5, I consider four possible objections to the QTT.

In this section, I review two different interpretations of the TT: (i) the Literal Interpretation, endorsed by the Original Imitation Game (Sterrett 2000), and (ii) the Standard Interpretation, endorsed by the Standard Turing Test (Moor 2001). The former holds that the results of the TT are given by the comparison between the human’s performance and the machine’s performance; the latter holds that the results are given directly by the judge’s decision, with no benchmark or comparison needed. I advocate the Literal Interpretation as the proper one, and I use the experimental design of the OIG as the experimental design of the QTT. The OIG is based on the first formulation of the test given by Turing (1950), and it involves two phases. The first phase is played by A (man), B (woman) and C (the judge): here, C puts questions to A and B in order to identify the woman. The second phase, introduced by the question “What will happen when a machine takes the part of A in this game?”, [Footnote 7] is played in the same way by M (machine), B (woman) and C (the judge). If C decides “wrongly as often [...] as he does when the game is played between a man and a woman”, [Footnote 8] then M passes the test.
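The comparative pass criterion of the OIG, on which C must decide wrongly as often with the machine as with the man, can be made concrete with a small sketch. What follows is only an illustration under stated assumptions: the function names are hypothetical, the verdict lists are invented for the example, and reading "as often" as "at least as often" is one possible interpretation, since Turing's wording does not fix a statistical threshold.

```python
def misidentification_rate(verdicts):
    """Fraction of rounds in which judge C failed to identify the woman.

    Each entry is True when C identified the woman correctly in that round.
    """
    if not verdicts:
        raise ValueError("at least one round is required")
    wrong = sum(1 for correct in verdicts if not correct)
    return wrong / len(verdicts)


def machine_passes_oig(phase_one, phase_two):
    """Compare phase two (M and B) against the phase-one benchmark (A and B).

    Assumption: M passes when C errs at least as often as in phase one.
    """
    return misidentification_rate(phase_two) >= misidentification_rate(phase_one)


# Invented verdicts, for illustration only.
phase_one = [True, False, True, True]   # man vs. woman: C wrong in 1 of 4 rounds
phase_two = [True, False, False, True]  # machine vs. woman: C wrong in 2 of 4 rounds

print(machine_passes_oig(phase_one, phase_two))
```

The point of the sketch is that the result depends on a comparison with the human benchmark of phase one, as the Literal Interpretation holds, and not on C's verdict about the machine taken in isolation.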
