Question
Understanding P-Hacking
Now go out and read the associated article: https://fivethirtyeight.com/features/science-isnt-broken/#part1. You might also start clicking on some of the links, at least for things that interest you and/or that you think might be relevant to answering the questions below. Whenever possible, make references to the exercise you did above, or to information from the linked article. For these questions we expect you to form a complete thought: more than a sentence, less than a page.
(a) Are all p-hackers purposefully cheating? Can you imagine doing this by accident? Defend your viewpoint with logic and/or evidence.
(b) Can reasonable people choose different methods and get different results on the same data? Defend your viewpoint with logic and/or evidence.
(c) In spite of all this trouble with reproducing research, the author of this article suggests that science isn’t in trouble. What do you think? Defend your viewpoint with logic and/or evidence.
Explanation / Answer
1. When a distinguished but elderly scientist states that “You have no choice but to accept that the major conclusions of these studies are true,” don’t believe him.
2. The only way of discovering the limits of the reasonable is to venture a little way past them into the unreasonable.
3. Any sufficiently crappy research is indistinguishable from fraud.
It’s the third law that particularly interests me.
Elsewhere, we sometimes get into disputes about untrustworthy research: things like power pose or embodied cognition, which didn’t successfully replicate; things like ovulation and voting, or beauty and sex ratio, which never had a chance of being discovered by the experiments in question (that kangaroo thing); or things like the Daryl Bem ESP studies, which just about nobody believed in the first place, and when people looked carefully it turned out the analysis was a garden of forking paths. Things like himmicanes and hurricanes, where we’re like, Who knows? Anything could happen! But the evidence presented to us is pretty much empty. Or that air pollution in China study, where everyone’s like, Sure, we believe it, but, again, if you believe it, it can’t be because of the non-evidence in that paper.
What all these papers have in common is that they make serious statistical errors. That’s ok, statistics is hard. As I recently wrote, there’s no reason we should trust a statistical analysis, just because it appears in a peer-reviewed journal. Remember, the problem with peer review is with the “peers”: lots of them don’t know statistics either, and lots of them are motivated to believe certain sorts of claims (such as “embodied cognition” or whatever) and don’t care so much about the quality of the evidence.
And now to Clarke’s updated third law. Some of the work in these sorts of papers is so bad that I’m starting to think the authors are making certain mistakes on purpose. That is, they’re not just ignorantly walking down forking paths, picking up shiny statistically significant comparisons, and running with them. No, they’re actively torturing the data, going through hypothesis after hypothesis until they find the magic “p less than .05,” they’re strategically keeping quiet about alternative analyses that didn’t work, they’re selecting out inconvenient data points on purpose, knowingly categorizing variables to keep good cases on one side of the line and bad ones on the other, screwing around with rounding in order to get p-values from just over .05 to just under . . . all these things. In short, they’re cheating.
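The multiple-comparisons mechanism described above is easy to demonstrate numerically. The sketch below (my own illustration, not from the article; the test statistic and sample sizes are arbitrary choices) runs twenty two-sample comparisons on pure noise, where no real effect exists anywhere, and counts how many clear the p < .05 bar. With twenty independent noise comparisons, the chance that at least one is "significant" is about 1 − 0.95²⁰ ≈ 64%.

```python
# Minimal sketch of p-hacking via multiple comparisons: test enough
# noise-only hypotheses and "p < .05" shows up on its own.
import random
import math

random.seed(1)

def two_sample_p(a, b):
    """Approximate two-sided p-value for a two-sample z-test
    (normal approximation via the complementary error function)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    return math.erfc(abs(z) / math.sqrt(2))

# Twenty "forking path" comparisons on pure noise: no real effect anywhere.
hits = 0
for trial in range(20):
    a = [random.gauss(0, 1) for _ in range(30)]
    b = [random.gauss(0, 1) for _ in range(30)]
    if two_sample_p(a, b) < 0.05:
        hits += 1

# Number of noise-only comparisons that cleared p < .05.
print(hits)
```

A researcher who reports only the winning comparison, and stays quiet about the other nineteen, has produced a publishable "finding" from nothing.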
Even when they’re cheating, I have no doubt that they are doing so for what they perceive to be a higher cause.
Are they cheating or are they just really, really incompetent? When Satoshi Kanazawa publishes yet another paper on sex ratios with sample sizes too small to possibly learn anything, even after Weakliem and I published our long article explaining how his analytical strategy could never work, was he cheating—knowingly using a bad method that would allow him to get statistical significance, and thus another publication, from noise? Or was he softly cheating, by purposely not looking into the possibility that his method might be wrong, just looking away so he could continue to use the method? Or was he just incompetent, trying to do the scientific equivalent of repairing a watch with garden shears? Same with Daryl Bem and all the rest. I’m not accusing any of them of fraud! Who knows what was going through their minds when they were doing what they were doing.