# The earth is round (p < .05)

@article{Cohen1994TheEI, title={The earth is round (p < .05)}, author={Jacob Cohen}, journal={American Psychologist}, year={1994}, volume={49}, pages={997-1003} }

After 4 decades of severe criticism, the ritual of null hypothesis significance testing (mechanical dichotomous decisions around a sacred .05 criterion) still persists. This article reviews the problems with this practice, including near universal misinterpretation of p as the probability that H₀ is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H₀ one thereby affirms the theory that led to the test…
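The first misinterpretation named in the abstract, reading p as the probability that H₀ is false, can be illustrated with a small Bayes' rule calculation. The following is only a minimal sketch under stated assumptions: the function name `prob_h0_given_significant` and the prior, alpha, and power values are hypothetical and chosen for illustration, not taken from the article. It also conditions on the event "the test rejected H₀ at level alpha" rather than on an exact p value.

```python
def prob_h0_given_significant(prior_h0: float, alpha: float, power: float) -> float:
    """Return P(H0 true | test rejected H0) via Bayes' rule.

    prior_h0: assumed prior probability that H0 is true (hypothetical input)
    alpha:    probability of rejecting H0 when H0 is true
    power:    probability of rejecting H0 when H0 is false
    """
    p_reject_and_h0 = alpha * prior_h0          # false rejections
    p_reject_and_h1 = power * (1.0 - prior_h0)  # correct rejections
    return p_reject_and_h0 / (p_reject_and_h0 + p_reject_and_h1)


if __name__ == "__main__":
    # Illustrative values only: alpha = .05, power = .50, two different priors.
    for prior in (0.5, 0.9):
        posterior = prob_h0_given_significant(prior, alpha=0.05, power=0.5)
        print(f"P(H0) = {prior:.2f} -> P(H0 | rejected) = {posterior:.3f}")
```

Under these assumed inputs the posterior probability of H₀ after a rejection depends on the prior and on power, so it is not fixed by the .05 criterion alone.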

#### 3,725 Citations

The Earth Is Not Round (p = .00)

- Business
- 2011

Continued discussion and debate regarding the appropriate use of null hypothesis significance testing (NHST) has led to greater reliance on effect size testing (EST) in published literature. This…

The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research

- Medicine, Psychology
- PeerJ
- 2017

The widespread use of ‘statistical significance’ as a license for making a claim of a scientific finding leads to considerable distortion of the scientific process, and potential arguments against removing significance thresholds are discussed.

Testing Significance Testing

- Psychology
- 2018

The practice of Significance Testing (ST) remains widespread in psychological science despite continual criticism of its flaws and abuses. Using simulation experiments, we address four concerns about…

The Earth is spherical (p < 0.05): alternative methods of statistical inference

- 2000

A literature review was conducted to understand the limitations of well-known statistical analysis techniques, particularly analysis of variance. The review is structured around six major points: (1)…

How significant (p < 0.05) is geomorphic research?

- Computer Science
- 2014

The pervasive application of the Null Hypothesis Significance Test in geomorphic research runs counter to widespread, long running, and often severe criticism of the method in the broader scientific…

Manipulating the Alpha Level Cannot Cure Significance Testing

- Psychology, Medicine
- Front. Psychol.
- 2018

We argue that making accept/reject decisions on scientific hypotheses, including a recent call for changing the canonical alpha level from p = 0.05 to p = 0.005, is deleterious for the finding of new…

Assessing environmentally significant effects: a better strength-of-evidence than a single P value?

- Mathematics, Medicine
- Environmental Monitoring and Assessment
- 2013

A strength-of-evidence procedure that lends itself to a simple confidence interval interpretation and is accompanied by a strength-of-evidence matrix that has many desirable features: not only a strong/moderate/dubious/weak categorisation of the results, but also recommendations about the desirability of collecting further data to strengthen findings.

Non-significant results in ecology: a burden or a blessing in disguise?

- Psychology
- 2003

Null hypothesis significance testing remains a common practice in ecology, despite criticism by statisticians (Yoccoz 1991, Cohen 1994) and numerous suggested alternatives (Jones and Matloff 1986,…

Sherlock Holmes and the Death of the Null Hypothesis

- Psychology
- 2014

In the eighty years since R.A. Fisher’s original work, null hypothesis significance testing has become the ubiquitous research methodology in fields as diverse as biology, agronomy, social science,…

#### References

SHOWING 1-10 OF 65 REFERENCES

p values, hypothesis tests, and likelihood: implications for epidemiology of a neglected historical debate.

- Psychology, Medicine
- American journal of epidemiology
- 1993

An analysis using another method promoted by Fisher, mathematical likelihood, shows that the p value substantially overstates the evidence against the null hypothesis.

Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology.

- Psychology
- 1978

Theories in “soft” areas of psychology lack the cumulative character of scientific knowledge. They tend neither to be refuted nor corroborated, but instead merely fade away as people lose…

Appraising and Amending Theories: The Strategy of Lakatosian Defense and Two Principles that Warrant It

- Psychology
- 1990

In social science, everything is somewhat correlated with everything (“crud factor”), so whether H0 is refuted depends solely on statistical power. In psychology, the directional counternull of…

The test of significance in psychological research.

- Medicine, Psychology
- Psychological bulletin
- 1966

The test of significance does not provide the information concerning psychological phenomena characteristically attributed to it; and a great deal of mischief has been associated with its use. The…

Consequences of Prejudice Against the Null Hypothesis

- Psychology
- 1975

The consequences of prejudice against accepting the null hypothesis were examined through (a) a mathematical model intended to simulate the research-publication process and (b) case studies of…

Do studies of statistical power have an effect on the power of studies?

- Psychology
- 1989

The long-term impact of studies of statistical power is investigated using J. Cohen's (1962) pioneering work as an example. We argue that the impact is nil; the power of studies in the same journal…

The fallacy of the null-hypothesis significance test.

- Psychology, Medicine
- Psychological bulletin
- 1960

To the experimental scientist, statistical inference is a research instrument, a processing device by which unwieldy masses of raw data may be refined into a product more suitable for assimilation into the corpus of science, and in this lies both strength and weakness.

Significance Tests Die Hard

- Psychology
- 1995

We present a critique showing the flawed logical structure of statistical significance tests. We then attempt to analyze why, in spite of this faulty reasoning, the use of significance tests…

Statistical significance in psychological research.

- Psychology, Medicine
- Psychological bulletin
- 1968

Sapolsky (1964) developed the following substantive theory: Some psychiatric patients entertain an unconscious belief in the "cloacal theory of birth" which involves the notions of oral impregnation and anal parturition, which led Sapolsky to predict that Rorschach frog responders show.

Theory-Testing in Psychology and Physics: A Methodological Paradox

- Psychology
- Philosophy of Science
- 1967

Because physical theories typically predict numerical values, an improvement in experimental precision reduces the tolerance range and hence increases corroborability. In most psychological research,…