Damaging the Case for Improving Social Science Methodology Through Misrepresentation: Re-Asserting Confidence in Hypothesis Testing as a Valid Scientific Process
By: James Nicholson, Sean McCusker, Volume 21 (2)
Abstract: This paper is a response to Gorard’s article, “Damaging real lives through obstinacy: re-emphasising why significance testing is wrong” in Sociological Research Online 21(1). For many years Gorard has criticised the way hypothesis tests are used in social science, but recently he has gone much further, arguing that the logical basis for hypothesis testing is flawed: that hypothesis testing does not work, even when used properly. We have sympathy with the view that hypothesis testing is often carried out in social science contexts when it should not be, and that outcomes are often described in inappropriate terms, but this does not mean that the theory of hypothesis testing, or its use, is flawed per se. There needs to be evidence to support such a contention. Gorard claims that those who continue to “teach, use or publish significance tests are acting unethically, and knowingly risking the damage that ensues.” This is a very strong statement which impugns the integrity, not just the competence, of a large number of highly respected academics. We believe the evidence he puts forward in this paper does not stand up to scrutiny: Gorard misrepresents what hypothesis tests claim to do, and in the simulation he constructs he uses a sample size he should know is far too small to discriminate reliably a 10% difference in means. He then claims that this simulation models emotive contexts in which a 10% difference would be important to detect, thereby misrepresenting it as a reasonable model of those contexts.
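The sample-size objection here can be illustrated with a standard power calculation. The means, standard deviation, and power target below are illustrative assumptions for a sketch, not the parameters of Gorard’s actual simulation:

```python
import math
from statistics import NormalDist

def required_n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided two-sample
    z-test to detect a true mean difference `delta`, assuming a
    common standard deviation `sd` in both groups."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value
    z_beta = NormalDist().inv_cdf(power)           # power requirement
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

# Illustrative numbers: group means of 50 vs 55 (a 10% difference)
# with SD 20 need roughly 250 cases per group for 80% power. A test
# run on a much smaller sample will usually miss the difference,
# regardless of whether the difference matters in context.
print(required_n_per_group(delta=5, sd=20))
```

Under these assumed numbers the test needs about 250 cases per group; a simulation with far fewer cases would be expected to miss a real 10% difference most of the time, which is the responders’ point about the simulation’s design.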
Is Banning Significance Testing the Best Way to Improve Applied Social Science Research? – Questions on Gorard (2016)
By: Thees F Spreckelsen, Mariska van der Horst, Volume 21 (3)
Abstract: Significance testing is widely used in social science research. It has long been criticised on statistical grounds and for problems in research practice. This paper is an applied researchers’ response to Gorard’s (2016) 'Damaging real lives through obstinacy: re-emphasising why significance testing is wrong' in Sociological Research Online 21(1). Gorard enters this debate by concluding from the issues raised that the use and teaching of significance testing should cease immediately. He goes beyond a mere ban of significance testing, claiming that researchers who continue the practice are being unethical. We argue that his attack on applied scientists is unlikely to improve social science research, and we believe he does not sufficiently prove his claims. In particular, we are concerned that, with a narrow focus on statistical significance, Gorard misses alternative, if not more important, explanations for the often-lamented problems in social science research. Instead, we argue that it is important to take into account the full research process, not just the step of data analysis, to get a better idea of the best evidence regarding a hypothesis.
Significance Testing is Still Wrong, and Damages Real Lives: A Brief Reply to Spreckelsen and Van Der Horst, and Nicholson and McCusker
By: Stephen Gorard, Volume 22 (2)
Abstract: This paper is a brief reply to two responses to a paper I published previously in this journal. In that first paper I presented a summary of part of the long-standing literature critical of the use of significance testing in real-life research, and reported again on how significance testing is abused, leading to invalid and therefore potentially damaging research outcomes. I illustrated and explained the inverse logic error that is routinely made in significance testing, and argued that all of this should now cease. Although clearly disagreeing with me, neither of the responses to my paper addressed these issues head on. One focussed mainly on arguing with things I had not said (such as that there are no other problems in social science). The other tried to argue either that the inverse logic error is not prevalent, or that there is some other unspecified way of presenting the results of significance testing that does not involve this error. This reply paper summarises my original points, deals with each response paper in turn, and then turns to an examination of how the responders use significance testing in practice in their own studies. All of them use significance testing exactly as I described in the original paper – with non-random cases, and erroneously treating the probability of the observed data as though it were the probability of the hypothesis that was assumed in order to calculate that probability.
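The inverse logic error at issue – reading a p-value, which is P(data | null hypothesis), as if it were P(null hypothesis | data) – can be shown with a small simulation. The 80/20 mix of true and false nulls, the effect size, and the per-study sample size below are illustrative assumptions for a sketch:

```python
import random
from statistics import NormalDist

random.seed(1)

def p_value_one_sample(xs, mu0=0.0):
    """Two-sided z-test p-value for the mean of xs, assuming SD = 1."""
    n = len(xs)
    z = (sum(xs) / n - mu0) * n ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Simulate many studies: in 80% the null is true (effect 0), in 20%
# there is a small real effect (0.2 SD); each study has n = 30.
sig_and_null = sig = 0
for _ in range(20000):
    null_true = random.random() < 0.8
    effect = 0.0 if null_true else 0.2
    xs = [random.gauss(effect, 1) for _ in range(30)]
    if p_value_one_sample(xs) < 0.05:
        sig += 1
        sig_and_null += null_true

# Each "significant" result had p < 0.05, yet the share of those
# results where the null is actually true is far higher than 5%
# (roughly half under these assumptions).
print(sig_and_null / sig)
```

The point of the sketch is that the 0.05 threshold constrains P(data | null), while the quantity a researcher cares about, P(null | data), depends on base rates and power and can be an order of magnitude larger – so a p-value cannot be reported as the probability that the hypothesis is true.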