Albert Einstein was chosen by Time magazine as the Person of the Twentieth Century. It was a good choice (and now is a good time to read Einstein more than seventy years after his death in April 1955, here and here). As noted by Thomas Plümper and Eric Neumayer in their newly published book, The Credibility Crisis in Science: Tweakers, Fraudsters, and the Manipulation of Empirical Results, Einstein finished ahead of the runners-up Franklin Delano Roosevelt, who saved capitalism from itself during the Great Depression, and Mohandas Gandhi, who showed the world that, despite the iron will of Winston Churchill, the sun was setting rapidly on the British Empire. Plümper and Neumayer write “Those days are over (for science). And they may never come back.” This is the theme of their book, and they might be right.
We have discussed the problems of modern science here often, going back to my first contribution here. The goal of The Credibility Crisis in Science is to show how science has gone wrong. It is difficult to disagree with Plümper and Neumayer:
The credibility of science is not, as such, about scientific ideas, theories, or models that turn out to be false. Rather, the crisis has been caused by scientists who deliberately publish overconfident, misleading, and often simply false empirical results based on research designs or model specifications they have intentionally specified to give the desired results. We call this practice “tweaking.” In extreme cases, published results rely on manipulated or outright fabricated data. Whether tweaked, manipulated, or fabricated, the results often cannot be replicated – not even if replication analysts use identical research designs.
…
Tweaking is potentially more damaging to science in the long run than data manipulation and fabrication. Any particular tweaked empirical result is likely to have a smaller effect on the fabric of science than cases of data fabrication and manipulation, but the cumulative effect can still be larger than the cumulative effect of data fabrication and manipulation because these strategies are rare, while tweaking is common.
While they discuss data fabrication and manipulation, which are variants of the same offense, their view is that tweaking is the larger threat to the integrity of science. Once again they are correct, but tweaking is especially “useful” in the social sciences (e.g., psychology, sociology, economics), whose experimental approaches are qualitatively and quantitatively different from those of the natural sciences. My view is that fabrication, manipulation, and tweaking do equal damage to science, while affecting distinct sciences differently. In the basic and clinical natural sciences, fabrication and manipulation will be found out, eventually, and the consequences for the perpetrators will be severe, even if this takes too long most of the time.
Plümper and Neumayer cover various explanations of the behavior of dishonest scientists. In some cases the scientist has the need to be “first” in the pathological winner-take-all environment of “publish and still perish anyway.” In other cases the scientist is (probably) simply inattentive to the actions of other laboratory or team members, although nothing is simple about such lassitude, as the case of the former president of Stanford, Marc Tessier-Lavigne, shows. Can such behavior be predicted? Plümper and Neumayer spend a lot of time on this. The answer is, “usually not.” But in the case of Sylvain Lesné (here and here), an early mentor is reported to have dismissed him because of perceived dishonesty in his laboratory in France. This did not stop the young scientist from advancing in his career, however.
Just “making stuff up” as Sylvain Lesné did by manipulating images in his work on the amyloid hypothesis of Alzheimer’s disease (AD) probably did keep AD scientists on the wrong path for nearly twenty years. The opportunity costs remain unknown. Francesca Gino’s apparently fraudulent research (she disputes the findings of her former employer, Harvard University) did less damage but was more audacious, and embarrassing for her credulous collaborators. Diederick Stapel was truly sui generis in his production of fictitious research. Plus, he freely admits to his dishonesty while using several excuses to justify it. The Gino and Stapel cases are each covered at some length in The Credibility Crisis in Science. It is difficult to treat them as much more than examples of academics gone around the bend, however. [1]
Tweaking is more difficult to detect. One need only consider modern eugenics (not discussed in The Credibility Crisis in Science), which is essentially the same as that of the nineteenth century, except for its supposedly sophisticated grounding in genetics. The Genetic Lottery by Kathryn Paige Harden of the University of Texas has been discussed previously here in Prolegomena to an Understanding of the Replication Crisis in Science and later in Hayek’s Bastards and the Rise of Neoliberalism. It is not difficult to argue for the lesser intelligence of “the Other” when tweaking the questions and choosing the proper data used to answer them. Detection of these tweaks requires a strong background in genetics along with the knowledge that general intelligence (g) is not a unitary explanation of human intelligence, especially across different groups that differ primarily in skin pigmentation and/or socioeconomic status. [2]
The so-called “replication crisis of science” is covered well by Plümper and Neumayer, during which they consider the work of John Ioannidis, who originated this trope with a paper in PLOS Medicine in 2005 entitled Why most published research findings are false:
In a widely read essay…the enfant terrible of methodology in medicine…argued that “most published research findings are false,” giving the result away in the title of his essay. He offers no empirical evidence for this claim, which is entirely based on logical (and statistical) reasoning, and his calculations hold only under certain restrictive and unrealistic assumptions. This is fortunate for science because Ioannidis’s argument does not rely on data fraud in any way, shape, or form. He claims that the majority of research findings are false even if they are based on honest research. [3] If his assertion is correct, and if many researchers have indeed additionally fabricated or manipulated data or tweaked their results, then the proportion of false published research is significantly higher than what Ioannidis claims.
…Apparently, Ioannidis has learned one or two lessons from YouTube clickbait titles: Make very bold, headline-grabbing claims, whether they stand up to scrutiny or not. Ioannidis’s logic goes like this: Many empirical studies are under-powered, their sample size is too small, or the effect, if it exists at all, is too small to recover “true positives” defined as effects that truly exist in reality. Combined with testing hypotheses that are rather unlikely to be true, this can easily lead the majority of published articles with statistically significant findings to misleadingly suggest that an effect exists where there is none.
…
We find Ioannidis’s core assumption unrealistic. Although research practices vary from field to field and from area to area, researchers generally tend to test hypotheses that have a fairly high probability of actually being true. Researchers do not, as Ioannidis implicitly assumes, test random hypotheses that have a uniform prior probability of being true. This is simply a misrepresentation of how scientists work and how they select their projects. Based on well-supported theories and existing evidence, they often test hypotheses that have a high probability of being true. As a consequence, the probability of small p-values (and greater statistical significance) is larger, arguably much larger, than Ioannidis assumes.
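The disagreement between Ioannidis and his critics reduces to a short calculation of the positive predictive value (PPV) of a statistically significant result, which depends on the pre-study probability that the hypothesis is true, statistical power, and the false-positive rate. A minimal sketch, using illustrative numbers of my own choosing (not figures taken from the book or from Ioannidis’s paper):

```python
# Positive predictive value: the share of statistically significant
# findings that reflect real effects. The numeric scenarios below are
# illustrative assumptions, not data from either source.

def ppv(prior: float, power: float, alpha: float = 0.05) -> float:
    """Probability that a 'significant' result is a true positive.

    prior: pre-study probability that the tested hypothesis is true
    power: probability of detecting a real effect (1 - beta)
    alpha: false-positive rate when the null hypothesis is true
    """
    true_positives = prior * power
    false_positives = (1 - prior) * alpha
    return true_positives / (true_positives + false_positives)

# Ioannidis-style scenario: long-shot hypotheses, underpowered studies.
low = ppv(prior=0.10, power=0.20)   # about 0.31: most findings false

# Plümper and Neumayer's objection: researchers mostly test
# well-grounded hypotheses with adequate power.
high = ppv(prior=0.50, power=0.80)  # about 0.94: most findings true

print(f"long-shot, underpowered: PPV = {low:.2f}")
print(f"well-grounded, powered:  PPV = {high:.2f}")
```

The arithmetic makes the authors’ point concrete: whether “most published research findings are false” turns almost entirely on the assumed prior probability of the hypotheses being tested, which is exactly the assumption they dispute.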
Ioannidis is called an “enfant terrible” by Plümper and Neumayer. This fits, and he is definitely a gadfly who from 2005 through 2025 averaged 56 publications a year, or about one a week. This is remarkable, and frankly unbelievable if each author on a scientific paper is responsible for the entire content of that paper (that this requirement is often unmet explains much scientific misconduct). His most famous paper from 2005 was also corrected in 2022, which leads one to believe that few readers got past its very useful clickbait title.
Still, a common recommendation for increasing the credibility of science is that published research must be replicated before it is accepted as true or, more correctly, useful. However, without repeating myself too much, the goal of science is not to produce truth. Truth is for theologians and philosophers (and our modern politicians). The goal of science is to produce useful, factual information that allows us to understand the natural world better. [4] Very few scientific experiments that are modeled on complex systems are precisely replicable. [5] But results that do not allow two questions to grow where there was only one before (Thorstein Veblen) are not useful, whether they are correct or not. Sometimes it is the mistakes in theory or interpretation that lead to deeper understanding of a scientific question.
So, what are the conclusions reached in The Credibility Crisis in Science? For those who feel compelled to cheat for whatever reason, deterrence is unlikely to work. But when a scientist is caught fabricating results, a career will end. Prevention is not as difficult as Plümper and Neumayer think, especially when the dishonesty is not perpetrated at the top. A good mentor and good scientist checks and understands every piece of primary data that eventually produces a result that is published. The rule with my students is “once is an anecdote, twice is data, and three times is a result.” And then we do it all over again from a slightly different perspective, rather than tweaking our conditions to get our hoped-for answer. When the principal investigator is the perpetrator, things get dicey because whistleblowers are not a beloved species. [6] Detection strategies will require access to all primary data by editors and reviewers. Anything less facilitates dishonesty. Plümper and Neumayer conclude that old-fashioned peer review is virtually the only strategy that will work in the long term. They are correct. But for this to happen, the nature of peer review and the current business of scientific publication must change.
The Credibility Crisis in Science is well worth the read, but it fails in its goal to properly diagnose the deeper problems of the “scientific enterprise.” As Plümper and Neumayer properly note:
Science has lost some (one might say much) of its standing with the public. While skepticism about scientific findings can be healthy (and it is essential) and is an inherent part of the scientific process, a general disbelief and distrust of scientific findings pose significant challenges. Scientists have a vested interest in regaining some (most) of that lost trust…But much would be gained if scientists were honest about the uncertainties associated with scientific results – honest with other scientists in scientific publications and honest in public statements. Scientists must learn to distinguish between scientific results and their private opinions, and they should promote brutal transparency in scientific research, not hide potential conflicts of interest, and find ways to improve communications between themselves and the public in order to rebuild trust.
Where to begin? The first place would be to define “science” (i.e., scientific research) as the disinterested search for new knowledge about the natural world, from the social psychology of political and religious belief to the structure and function of individual cells in the organism. [7] While this may be understood to be the case by the authors, they do not note that it is the scientistic Merchants of Doubt who first sowed distrust in science because it interfered with their interests. They are still with us and they are seldom called out as scientists on a specific mission to “prove something.”
No disinterested scientist (the only kind of scientist) objects to skepticism about scientific results before their utility has been demonstrated, because those results form the foundation for further advances. And no disinterested scientist disputes his or her vested interest in regaining trust that was lost because of the improper use of scientistic gestures on the part of Merchants of Doubt. A disinterested scientist is honest with himself or herself first, last, and always, and not one during my long career has conflated personal opinions with scientific results. Scientists who are not willing and able to share their data with other scientists are not scientists. Several professors of my early acquaintance who were in the Monsanto orbit stopped being disinterested scientists when they began to accept industrial support for their research and parroted the Monsanto line that gave us Roundup Ready commodity crops and herbicide-resistant weeds. This is recognized by Plümper and Neumayer, who nevertheless seem to need to be reminded at times that research and scientific research are not the same thing:
While it is often impossible to demonstrate that single studies suffer from vested interests, considerable evidence exists for bias at the aggregate level. For example, research financed by corporate sponsors is many times more likely to find supportive evidence than research not sponsored by corporations. This holds even when other “sponsorships” are present (Fabbri et al. 2018).
Yes, it does. And this is why the reader should always read the acknowledgments of a scientific paper first. Who pays can influence what is published and some of this research goes by the name of Evidence-Based Medicine. As Matthew G. Saroff commented nearly four years ago:
Much of the skepticism about science is not because people think themselves smarter than the scientist, though some do, but because people think the scientists are corrupt.
In my long experience, “corrupt” is seldom the exact description, but since the Bayh-Dole Act of 1980 the undercurrents of American biomedical science have militated strongly against “disinterested” as the ideal, default description of the typical scientist. One can reasonably say that scientists have lost the plot. Over the past five years the behavior of many scientists during the pandemic did not meet reasonable expectations, usually because the protagonists on multiple sides of the argument about how to respond to COVID-19 were not disinterested about the proper, if provisional, path to take in a very difficult situation. Many of them arrived at the argument fully formed, like Athena.
But this is in no way limited to science. Our politics and politicians failed, too. And they have been failing since the Neoliberal Dispensation turned our world into a winner-take-all society. This allows, or impels, both wings of the Uniparty to spend their time dialing for dollars and kowtowing to their masters on K Street instead of tending to the business of the republic. As for business and industry, in the 1950s the CEO of General Motors certainly had no love for Walter and Victor Reuther of the United Auto Workers, but he was proud to lead a company that employed several hundred thousand men (and a few women) at more than a living wage. The same is undoubtedly true of the CEO of General Electric. There can be no doubt that both considered themselves rich. Now, our Tech Bros look forward to abolishing those few such jobs that remain and taking it all. It is passing strange that so many of our compatriots are fine with this, which will not end well for them or anyone else.
Thomas Plümper and Eric Neumayer have provided in The Credibility Crisis in Science a useful, if somewhat didactic, overview of what is wrong with American science in the twenty-first century. However, they never really define what science is, and they mostly leave out the social and cultural context that has damaged scientists and science, the same context that has damaged society as a whole. There are no easy solutions to this problem. Maybe there are no solutions. But as members of a culture and society gone bad, scientists as a group are, in the end, no different from any other group. Until we scientists in the aggregate realize this, nothing can change.
Notes
[1] The case of Gino is covered here and in the linked articles in the piece, while that of Stapel is covered in an extensive Wikipedia entry that summarizes his remarkable case very well. Lesné is no longer a professor at the University of Minnesota.
[2] The Genetic Lottery was reviewed here and the inevitable follow-up discussion can be found here. The popular and improper misuse of science to support modern eugenics is the stock in trade of Charles Murray in The Bell Curve and other works.
[3] This seems to be a distant and unconvincing echo of Against Method by Paul Feyerabend, who was correct that there is no one scientific method, even though there is such a thing as the scientific method.
[4] From Prolegomena to an Understanding of the Replication Crisis in Science: Nancy Cartwright has the much better view, one that is more congenial to the practicing scientist who is paying attention. In her view, “theory and experiment do not a science make.” Yes, science can and has produced remarkable outputs that can be very reliable (the goal of science), “not primarily by ingenious experiments and brilliant theory…(but)…rather by learning, painstakingly on each occasion how to discover or create and then deploy…different kinds of highly specific scientific products to get the job done. Every product of science – whether a piece of technology, a theory in physics, a model of the economy, or a method for field research – depends on huge networks of other products to make sense of it and support it. Each takes imagination, finesse and attention to detail, and each must be done with care, to the very highest scientific standards…because so much else in science depends on it. There is no hierarchy of significance here. All of these matter; each labour is indeed worthy of its hire…Contrary to the conceit of too many scientists, the goal of science is not to produce truth. The goal of science is to produce reliable products that can be used to interpret the natural world and react to it as needed, for example, during a worldwide pandemic. This can be done only by appreciating the granularity of the natural world.”
[5] A simple example from my research. My first exercise as a postdoc was to replicate an experiment on the pH-dependence of the binding of my favorite protein to a binding partner in a multicomponent complex. I could not get the experiment to work, so we pivoted with little angst to something else that turned out to be much more useful. As it happened, I had purified my protein from smooth muscle. The previous paper had used the protein purified from human platelets. I later discovered the proteins were not the same. Although they were the same size and behaved similarly as far as we could tell, they were only 72% identical at the amino acid level because they were the products of a gene duplication in our vertebrate ancestors going back 440 million years to fish. What was thought to be a simple system was not. Most biological systems are similarly complex.
[6] For example, in this case, with which I became familiar after the fact, the Principal Investigator (PI) leading his research group manipulated images and lied about materials required to support several grant applications to NIH. He was not found out until he made the mistake of hiring a Research Associate with a PhD. His previous lab members were technicians, never graduate students or research fellows, who did the experiments after which the PI manipulated their data without their knowledge or consent. It is unlikely they had anything to do with the publications or grant applications. It is also absurd that NIH did not seek recompense (~$7M) for the wasted grant money, but that is another issue altogether.
[7] I leave out the physical sciences here because, in general, the absolutism of the atom does not grant much leeway. While chemists can be just as duplicitous as any other human being, their results are too close to well-understood theory to stray far from fact. Drs. Pons and Fleischmann and the University of Utah found this out rather quickly regarding cold fusion. A similar example from biology is the published, and finally retracted, paper by Felisa Wolfe-Simon and others on bacteria that (do not and cannot for chemical reasons) substitute arsenic for phosphorus in the structure of DNA [to go deeper into the weeds, a sugar-arsenate backbone would not be stable in water as is the sugar-phosphate backbone of nucleic acids (jpg)]. How this paper passed peer review remains a mystery to all who have considered the question.