In 1942, sociologist Robert Merton described the ethos of science in terms of its four key values: The first, universalism, meant the rules for doing research are objective and apply to all scientists, regardless of their status. The second, communality, referred to the idea that findings should be shared and disseminated. The third, disinterestedness, described a system in which science is done for the sake of knowledge, not personal gain. And the final value, organized skepticism, meant that claims should be scrutinized and verified, not taken at face value. For scientists, wrote Merton, these were “moral as well as technical prescriptions.”

In his new book, Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth, Stuart Ritchie endorses the above as a model for how science is meant to work. “By following the four Mertonian Norms, we should end up with a scientific literature that we can trust,” he writes. He then proceeds to spend the rest of the book explaining all the ways in which modern science fails to do just this.

Ritchie is a psychologist at King’s College London and the author of a previous book, Intelligence: All That Matters, about IQ testing. In Science Fictions he presents a broad overview of the problems facing science in the 21st century. The book covers everything from the replication crisis to fraud, bias, negligence, and hype. Much of his criticism is aimed at his own field of psychology, but he also covers these issues as they occur in other fields such as medicine and biology.

Underlying most of these problems is a common issue: the fact that science, as he readily concedes, is “a social construct.” Its ideals are lofty, but it’s an enterprise conducted by humans, with all their foibles. To begin with, the system of peer-reviewed funding and publication is based on trust. Peer review is meant to look for errors or misinterpretations, but it’s done under the assumption that submitted data are genuine, and that the descriptions of the methods used to obtain them are accurate.

Ritchie recounts how in the 1970s, William Summerlin, a dermatologist at the Memorial Sloan-Kettering Cancer Center, used a black felt-tipped pen to fake a procedure in which he’d purported to graft the skin from a black mouse onto a white one. (He was caught by a lab tech who spotted the ink and rubbed it off with alcohol.) Fraudulent studies like Summerlin’s are not one-off events. A few recent examples that Ritchie cites are a researcher who was caught faking cloned embryos, another found to be misrepresenting results from trachea implant surgeries, and a third who fabricated data in a study purporting to show that door-to-door canvassing could shift people’s opinions on gay marriage. With the rise of digital photography, scientists have manipulated images to make their data comply with their expectations; one survey of the literature found signs of image duplication in about 4 percent of some 20,000 papers examined.

But even when they’re not committing fraud, scientists can easily be influenced by biases. One of the revelations to come from psychology’s reckoning with its replication problem is that standard statistical methods for preventing bias are in fact subject to manipulation, whether intentional or not. The most famous example of this is p-hacking, where researchers conduct their analysis in a way that produces a favorable p-value, a much-abused and misunderstood statistic that gives the probability of getting a result at least as extreme as the one you saw if there wasn’t actually a real effect. (Ritchie’s footnote for p-hacking links to my WIRED story about how the phrase has gone mainstream.)
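To make the mechanics of p-hacking concrete, here is a minimal simulation sketch. It is an illustration only, not anything taken from Ritchie's book. It assumes a hypothetical researcher who compares two groups with no real difference between them, measures several independent outcomes, and reports only the smallest p-value. The function names, the sample size of 30 per group, and the simple z-test with a known standard deviation of 1 are all assumptions chosen to keep the example short.

```python
import random
from statistics import NormalDist, fmean


def p_value_no_effect(n, rng):
    """Two-sided z-test p-value for two groups drawn from the SAME distribution.

    Assumption: both groups are normal with mean 0 and known sd 1,
    so any "significant" difference is a false positive by construction.
    """
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    # Standard error of the difference in means when sd = 1 in both groups.
    z = (fmean(a) - fmean(b)) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(outcomes_tried, studies=2000, n=30, alpha=0.05, seed=0):
    """Share of simulated studies that can report p < alpha.

    Each study measures `outcomes_tried` independent outcomes and keeps
    only the smallest p-value, which is the p-hacking step.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        best = min(p_value_no_effect(n, rng) for _ in range(outcomes_tried))
        if best < alpha:
            hits += 1
    return hits / studies


honest = false_positive_rate(outcomes_tried=1)   # one pre-planned test
hacked = false_positive_rate(outcomes_tried=10)  # best of ten tests
print(f"one outcome tested:  {honest:.1%} of null studies look significant")
print(f"ten outcomes tested: {hacked:.1%} of null studies look significant")
```

With a single planned test, the share of false positives should land near the nominal 5 percent. With ten independent outcomes to choose from, theory puts it at about 1 - 0.95^10, or roughly 40 percent, even though nothing real is going on. Real outcome measures are usually correlated, so the inflation in an actual study would typically be smaller than in this idealized sketch.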

An overreliance on p-values helps explain the spread of studies showing “social priming,” where subtle or subconscious cues were said to have large effects on people’s behavior. For instance, one study claimed that when people read words associated with old people (like old or gray), it made them walk more slowly down a hallway afterwards. A functional bullshit meter would have flagged this finding, and many others like it, as suspicious; but when such findings are wrapped in the language of science, with an authoritative p-value and the peer-review stamp of approval, they gain a measure of credibility.