Why Are Scientists So Often Wrong? Publishing Manipulated Results from Unreplicable Experiments Is A Big Part Of The Problem.
In a year when scientists seemed to have gotten everything wrong, a book attempting to explain why is bizarrely relevant. Of course, science was in deep trouble long before the pandemic began, and Stuart Ritchie’s excellent Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth had been long in the making. Much welcomed, nonetheless, and very important.
For a contrarian like me, reading Ritchie is good for my mental sanity – but bad for my intellectual integrity. It fuels my priors that a lot of people, even experts, delude themselves into thinking they know things they actually don’t. Fantastic scientific results, either the kind blasted across headlines or those which gradually make it into public awareness, are often so poorly made that the results don’t hold up; they don’t capture anything real about the world. The book is a wake-up call for a scientific establishment often too blinded by its own erudite proclamations.
Filled with examples and accessible explanations, the book expertly leads the reader on a journey through science’s many troubles. Ritchie categorizes them by the four themes of the book’s subtitle: fraud, bias, negligence, and hype. Together, they all undermine the search for truth that is science’s raison d’être. It’s not that scientists willfully lie, cheat, or deceive – even though that happens uncomfortably often, even in the best of journals – but that poorly designed experiments, underpowered studies, spreadsheet errors, or intentionally or unintentionally manipulated p-values yield results that are too good to be true. Since academics’ careers depend on publishing novel, fascinating and significant results, most of them don’t look a gift horse in the mouth. If the statistical software says “significant,” they confidently write up the study and persuasively argue their amazing case before a top-ranked journal, its editors, and the slacking peers in the field who are supposed to police their mistakes.
Ritchie isn’t some crackpot science denier or conspiracy theorist working out of his mom’s basement; he’s a celebrated psychologist at King’s College London with lots of experience in debunking poorly made research, particularly in his own field of psychology. For the last decade or more, this discipline has been the unfortunate poster child for the “Replication Crisis,” the discovery that – to paraphrase the title of Stanford’s John Ioannidis’ well-known article – most published research findings are false.
Take the example of former Cornell psychology professor Daryl Bem and his infamous “psychic pornography” experiment that opens Ritchie’s book. On screens, a thousand undergraduates were shown two curtains, only one of which hid an image that the students were supposed to find. The choice was a coin toss, as they had no other information to go on. As expected, for most kinds of images they picked the right curtain about 50% of the time. But – and here was Bem’s claim to fame – when pornographic images hid behind the curtains, students chose the right one 53% of the time, enough to pass for statistical significance in his sample. The road to top-ranked publication was wide open.
When the article came out after passing peer review, the world was stunned to learn that undergrads could see the future – at least when images of a sexual nature were involved. Proven by science and certified by The Scientific Method™, the finding threw the psychology world into chaos. The study was done properly, passed peer review, and was published in a top field journal, using the same methods that underlie all the other well-known results in the field. Still, the result was totally bonkers. What had gone wrong?
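To see how small an edge can clear the conventional significance bar, here is a minimal sketch of the arithmetic. The numbers are illustrative assumptions, not Bem’s actual design: it treats the experiment as 1,000 independent guesses, 530 of them correct, and asks how often pure coin-flipping would do at least that well.

```python
from math import comb


def one_sided_binomial_p(hits: int, trials: int) -> float:
    """Probability of at least `hits` successes in `trials` fair coin flips."""
    favourable = sum(comb(trials, k) for k in range(hits, trials + 1))
    return favourable / 2 ** trials


# Assumed, illustrative numbers: 1,000 independent guesses, 530 correct (53%).
p_value = one_sided_binomial_p(530, 1000)
print(f"one-sided p-value: {p_value:.3f}")
```

Under these assumptions the p-value lands below the usual 0.05 threshold, even though the effect is only three percentage points above chance. A larger sample would make the same 53% look even more “significant,” which is why a passing p-value alone says little about whether a result is real.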
Or take the don of behavioral economics, Daniel Kahneman, whose many quirky experiments convinced an entire economics profession of individual irrationality and ultimately earned him the Nobel Prize. The psychological literature on so-called ‘priming,’ part of which is used by behavioral economists, suggested that tiny changes in settings can produce remarkably large impacts on behavior. For instance, subtly reminding people of money – through symbols or the clicking noise of coins – makes them behave more individualistically and care less about others. “Disbelief is not an option,” wrote Kahneman in his famous best-seller Thinking, Fast and Slow, “you have no choice but to accept that the major conclusions of these [priming] studies are true.”
Beginning in the 2010s, psychologists tried to replicate these famous results and more. When tried elsewhere, with other students, better equipment, or larger samples – or sometimes with the exact same data – the same results wouldn’t emerge. How odd. Lab teams tried to replicate many established findings, coming up way short: “The replication crisis seems,” writes Ritchie, “with a snap of its fingers, to have wiped about half of all psychology research off the map.” There was something structurally wrong in the way that psychology found and displayed knowledge. Some research.
Chance findings, like Bem’s supernatural students, sometimes make it through into published literature. More disheartening are the actual instances of fraud, where scientists forge their data, manipulate them, or simply make them up out of thin air. Ritchie’s many stories can make you lose faith in many a scientific establishment: scientists inventing spreadsheets (caught only because humans are very bad at creating true randomness), tilting microscope pictures sideways, reusing the same numbers while pretending they were another data set.
While everyone agrees that fraud is a problem, and the challenge is to prevent it or detect it before it causes too much damage, the other flaws (bias, negligence, and hype) are more widespread – and more damaging because of it. They operate in more subtle ways, out of sight and impossible for outsiders to adjust for. Take the file-drawer problem, where negative results are stuffed away while positive results – many of them obtained by chance, like Bem’s psychic experiment – are sent off for publication, giving a false impression of the state of the world, both in the literature and for the wider public.
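The file-drawer effect is easy to reproduce in a toy simulation. The sketch below is a simplification under stated assumptions (2,000 hypothetical studies, 400 guesses each, no real effect anywhere, and a one-sided z-test as the publication filter); none of these numbers come from Ritchie’s book.

```python
import random


def run_null_study(rng: random.Random, trials: int = 400) -> float:
    """Hit rate of one study in which guessing is pure chance (true rate 50%)."""
    hits = sum(rng.random() < 0.5 for _ in range(trials))
    return hits / trials


def is_significant(hit_rate: float, trials: int = 400, z_cutoff: float = 1.645) -> bool:
    """One-sided z-test against 50%; 1.645 corresponds to roughly p < 0.05."""
    standard_error = (0.25 / trials) ** 0.5
    return (hit_rate - 0.5) / standard_error > z_cutoff


rng = random.Random(2020)  # fixed seed so the sketch is repeatable
studies = [run_null_study(rng) for _ in range(2000)]

# Only "significant" studies leave the file drawer and get published.
published = [rate for rate in studies if is_significant(rate)]

print(f"share of studies published: {len(published) / len(studies):.1%}")
print(f"mean hit rate, all studies: {sum(studies) / len(studies):.3f}")
print(f"mean hit rate, published:   {sum(published) / len(published):.3f}")
```

Averaged over every study, the hit rate sits at chance. Averaged over only the published ones, it sits clearly above 50%, so a reader who sees nothing but the journals would conclude that an effect exists where there is none.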
What’s fascinating in Ritchie’s book are the discussions of many studies, claims, and experiments with which even non-experts are familiar. In a well-referenced and comprehensively cited account, Ritchie reports huge problems with the following hyped stories:
- Larger plates make you eat more.
- Going to the supermarket hungry makes you buy more calories.
- Eggs cause cardiovascular disease.
- In messy or dirty environments people display more racial stereotypes.
- Power posing (manspreading or placing your hands aggressively on your hips) creates a psychological and hormonal boost that correlates with higher risk tolerance and better life outcomes.
- Philip Zimbardo’s Stanford Prison Experiment and the inhuman cruelty by people in authority (debunked perhaps most effectively by Gina Perry’s many in-depth writings on famous psychology experiments).
- Sleeping less than six hours a night “demolishes your immune system [,] doubling your risk of cancer,” as the best-selling book Why We Sleep by Matthew Walker claimed.
All wrong. Every one of these much-publicized and discussed claims includes at least one of the following: misleading conclusions not warranted by the research itself; fabricated data; data manhandled to pass significance tests; incompetent experimental designs; or experiments that wouldn’t replicate when tried by other scientists. Taking them apart for a non-expert audience is where Ritchie really shines.
We’re not surprised that news headlines misunderstand, exaggerate, or fail to report nuance, but Ritchie shows that even the published literature that supports these claims has serious flaws that undermine its results. As far as the rest of the world is concerned, that hasn’t mattered much. For these claims, the cat was out of the bag. Many of their results have reached the nonscientific public and entered “common knowledge.” I have personally had three different people, on different occasions, inform me about the dangers of eating eggs – two of them in doctoral programs at some of the most prestigious universities in the world. Being sharp and being right are two very different things.
That makes me think that Hype is the worst of science’s many sins as trigger-happy researchers (or even admired institutions like NASA) write puffy press releases on some revolutionary claim that turns out to be fraudulent, negligent, poorly made, underpowered or just flat-out unsupported by its own research.
Some of these eerie stories of mistaken research have serious outcomes in the real world: examples include Reinhart and Rogoff’s government debt-inhibits-growth error, Paolo Macchiarini’s fraudulent endeavors operating on patients at Karolinska Institute, or the wholly made-up research that suggested the combined measles, mumps and rubella vaccine caused autism. Even smaller and comparatively more innocent errors like p-hacking, shifting outcome goalposts, or underpowered studies with much-too-large effects hurt science and make the world a worse place, as doctors and policy-makers use them in decision-making.
At one point in the Bias chapter, Ritchie himself loses hope, pondering how to overcome all these statistical and human-made failures to produce accurate knowledge about the world: “My response is that I have no idea,” he writes.
Somehow he still ends on a slightly more positive note. The last two chapters provide many hopeful suggestions for how science can improve on its many challenges: we can fund research differently; journals can pre-commit to publication if the study design is good enough, publishing more negative results; we can pre-register methods such that researchers can’t shift the target variable mid-study; we can withhold some grant money until publication, to financially punish researchers that file-drawer their unsuccessful results.
More refreshing is computer technology and the rampant transparency it allows. Entire datasets can be put online and code can be analyzed line-by-line by many more than the handful of peer reviewers and editors that (should) usually do so. Besides, the reason we unearthed so many fraudsters in the first place was by using clever algorithms that found inconsistencies in the statistical outputs of reported results.
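One well-known check of this kind is the GRIM test, which asks whether a reported mean is even arithmetically possible given the sample size. The sketch below is a simplified illustration, not necessarily one of the algorithms Ritchie has in mind: it assumes the underlying data are whole numbers (such as survey ratings), and the function name and example figures are made up for the purpose.

```python
def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """True if some whole-number total, divided by n, rounds to the reported mean.

    Assumes every individual data point is an integer, so the sum must be too.
    """
    tolerance = 0.5 * 10 ** -decimals + 1e-9  # half a unit in the last reported digit
    nearest_total = round(reported_mean * n)
    for total in (nearest_total - 1, nearest_total, nearest_total + 1):
        if abs(total / n - reported_mean) <= tolerance:
            return True
    return False


# Hypothetical reported results: 28 participants answering on a whole-number scale.
print(grim_consistent(5.18, 28))  # 145 / 28 rounds to 5.18, so this mean is possible
print(grim_consistent(5.19, 28))  # no whole-number total over 28 rounds to 5.19
```

A failed check does not prove fraud; a typo or an unreported exclusion can produce the same mismatch. It does flag a number that cannot be right as reported, using nothing more than what is printed in the paper.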
Ritchie warns against the nihilism of becoming “suspicious of any and all new results, given our knowledge that the stream of scientific progress is far from pure.” A healthy dose of skepticism is good – throwing the baby out with any polluted bathwater isn’t. Science is “one of humanity’s proudest achievements,” he proclaims, and simply because a lot of it is wrong, false, puffed, or misleading, it doesn’t mean that it never correctly identified anything important. On the contrary.
While I see myself falling into precisely the trap Ritchie fears – that people will misuse his book to deny even well-established scientific results – he worries more about the opposite problem. Correctly so: People, especially in the West, place an extraordinarily high trust in scientists – reaching over 90% in some countries. In the U.K., for instance, the population seems to have grown more trusting of science and its results over time.
The book, while scary and disheartening, is truth-seeking and ultimately optimistic. Ritchie doesn’t come to bury science; he comes to fix it. “The ideals of the scientific process aren’t the problem,” he writes on the last page, “the problem is the betrayal of those ideals by the way we do research in practice.”