this post was submitted on 11 Dec 2023
519 points (98.3% liked)
Science Memes
you are viewing a single comment's thread
I'm going to have to object. We don't use "false positive" and "false negative" as synonyms for Type I and Type II error because they're not the same thing. The difference is at the heart of the misuse of p-values by so many researchers, and the root of the so-called replication crisis.
Type I error is the risk of falsely concluding that the quantities being compared are meaningfully different when they are not, in fact, meaningfully different. Type II error is the risk of falsely concluding that they are essentially equivalent when they are not, in fact, essentially equivalent. Both are conditional probabilities; you can only get a Type I error when the things are, in truth, essentially equivalent and you can only get a Type II error when they are, in truth, meaningfully different. We define Type I and Type II errors as part of the design of a trial. We cannot calculate the risk of a false positive or a false negative without knowing the probability that the two things are meaningfully different.
This may be a little easier to follow with an example:
Let's say we have designed an RCT to compare two treatments with Type I error of 0.05 (95% confidence) and Type II error of 0.1 (90% power). Let's also say that this is the first large phase 3 trial of a promising drug and we know from experience with thousands of similar trials in this context that the new drug will turn out to be meaningfully different from control around 10% of the time.
So, in 1000 trials of this sort, 100 trials will be comparing drugs which are meaningfully different and we will get a false negative for 10 of them (because we only have 90% power). 900 trials will be comparing drugs which are essentially equivalent and we will get a false positive for 45 of them (because we only have 95% confidence).
That leaves 135 positive results in total (90 true positives plus 45 false positives), of which 45 are false. So the chance that a "significant" result is actually a false positive is 45/135 (33.3%), nowhere near the 5% Type I error we designed the trial with.
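The arithmetic above can be sketched in a few lines (the numbers are the ones assumed in this example: alpha = 0.05, power = 0.90, and a 10% base rate of real effects):

```python
# Worked example: fraction of "significant" results that are false positives.
alpha = 0.05   # Type I error rate (conditional on no real difference)
power = 0.90   # 1 - Type II error rate (conditional on a real difference)
prior = 0.10   # assumed fraction of trials comparing truly different drugs
trials = 1000

real = trials * prior                  # 100 trials with a real difference
null = trials - real                   # 900 trials with no real difference

true_positives = real * power          # 90 correct detections
false_negatives = real * (1 - power)   # 10 missed effects
false_positives = null * alpha         # 45 spurious "significant" results

fdr = false_positives / (false_positives + true_positives)
print(f"Positive results that are false: {fdr:.1%}")  # 33.3%
```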
Statisticians are awful at naming things. But there is a reason we don't give these error rates the nice, intuitive names you'd expect. Unfortunately we're also awful at explaining things properly, so the misunderstanding has persisted anyway.
This is a useful page which runs through much the same ideas as the paper linked above but in simpler terms: The p value and the base rate fallacy
And this paper tries to rescue p-values from oblivion by calling for 0.005 to replace the usual 0.05 threshold for alpha: Redefine statistical significance.
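To see what that proposal buys you, here is the same base-rate arithmetic wrapped in a small (hypothetical) helper, comparing the two thresholds under the same assumed 10% prior and 90% power:

```python
def false_discovery_rate(alpha, power, prior):
    """Fraction of 'significant' results that are false positives,
    given the base rate of real effects (straight Bayes' rule)."""
    true_positives = prior * power
    false_positives = (1 - prior) * alpha
    return false_positives / (false_positives + true_positives)

print(false_discovery_rate(0.05, 0.90, 0.10))   # about 0.33
print(false_discovery_rate(0.005, 0.90, 0.10))  # about 0.05
```

Under these assumptions, dropping alpha from 0.05 to 0.005 cuts the chance that a positive result is false from roughly one in three to roughly one in twenty.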