In recent years, some high-profile studies have passed vetting by anonymous external referees only to have holes poked in them after publication.
Last month, scientists at the Harvard-Smithsonian Center for Astrophysics announced a finding that could be one of the most important scientific discoveries of the 21st century. BICEP 2, their microwave telescope at the South Pole, detected a distortion that appears to have been caused by gravitational waves: ripples in space whose detection is thought to be the first direct confirmation that our universe quickly “inflated” after the Big Bang. The findings, if confirmed, could answer some of science’s most fundamental questions about how the universe was formed.
The announcement hit every major media organization within hours. But nearly all of the news coverage included the same caveat: The BICEP 2 findings have not yet been peer-reviewed.
Peer review at scholarly journals involves recruiting experts to evaluate a paper before it is approved for publication. When a paper is submitted, the editors send it to two or three reviewers who are considered knowledgeable about the topic. The reviewers and the authors, in theory, do not know each other’s identities. If the reviewers raise objections to the methods or conclusions, the authors must revise the paper before it will be accepted for publication. If the objections are significant, the paper is rejected.
Most observers regard non-peer-reviewed results as, at best, preliminary. Instinctively, this makes sense. When a paper is printed in a scientific journal, it acquires the “imprimatur of scientific authenticity” (to quote the physicist John Ziman) and many observers consider its findings to be established scientific facts. It seems like a good idea to subject a paper to expert scrutiny before granting it that sort of status.
But it turns out that peer review is only the scientific community’s most recent method of providing this scrutiny—and it’s worth asking if science is, in fact, “real” only if it’s been approved by anonymous referees.
A few years ago I began writing a book about the history of Nature, one of the world’s most prestigious scientific publications. I was incredibly surprised to learn that Nature published some papers without peer review up until 1973. In fact, many of the most influential texts in the history of science were never put through the peer review process, including Isaac Newton’s 1687 Principia Mathematica, Albert Einstein’s 1905 paper on relativity, and James Watson and Francis Crick’s 1953 Nature paper on the structure of DNA.
Most existing historical accounts claim that peer review began at the Philosophical Transactions of the Royal Society, founded in 1665. And indeed, Henry Oldenburg, the Royal Society secretary who managed the Transactions, did sometimes solicit opinions on papers that he was considering for publication. It would be far too simplistic to say that peer review emerged fully formed from the 17th century, however. Oldenburg consulting his friends about the occasional Transactions paper is a far cry from our current system, which generally involves anonymity and reports from multiple referees.
The first formalized refereeing procedures emerged at scientific societies in the 18th century. In 1731, the Royal Society of Edinburgh began to distribute submissions “according to the subject matter to those members who are most versed in these matters.” By the 19th century, the Royal Society of London consulted referees on nearly all papers submitted to the Transactions. These referees prepared reports on the papers, but authors generally would not see them—the reports were meant to help the editors decide which submissions to print, not to suggest revisions.
Many widely read specialist journals in the 18th and 19th centuries, however, had no systematic refereeing procedures at all. Commercial scientific journals (such as the Philosophical Magazine and Nature) were often run by dynamic editors who felt qualified to evaluate any contribution. Systematic refereeing was even less common outside the English-speaking scientific world. Academic journals in France and Germany, for example, generally trusted the prominent scientists on their editorial boards to make decisions about which papers to print.
Crucially, journals without refereeing processes were not seen as inferior to, or less “scientific” than, those that used referees. Few scientists thought that two anonymous readers would better judge a paper than, say, the great physicist Max Planck (who was on the editorial board of the prominent German journal Annalen der Physik). Scientists unaccustomed to refereeing did not see it as an obviously superior system. In 1936, Albert Einstein, who was used to people like Planck making decisions about his papers without outside opinions, was incensed when the American journal Physical Review sent his submission to another physicist for evaluation. In a terse note to the editor, Einstein wrote:
“I see no reason to address the—in any case erroneous—comments of your anonymous expert. On the basis of this incident I prefer to publish the paper elsewhere.”
It was not until the late 20th century that external refereeing came to be seen as an essential feature of a respectable scientific journal. While historians are still trying to work out the reasons for this change, the new emphasis on peer review (a term that itself originated after the Second World War) seems to have been partly a response to the increased public scrutiny that came with massive Cold War financial investments in science. Scientists used peer review to explain why the public—and their fellow scientists—should trust their work and feel confident giving money to scientific research.
The explosion in the number of papers being submitted to postwar journals may have provided a secondary motivation. Physical Review, for example, went from publishing 2,310 pages in 1940 to publishing 24,544 pages in 1969. Placing more emphasis on referees may have been a way to lessen the burden on editors.
Peer review’s history is of particular interest now because there is an increasing sense in the scientific community that all is not well with the peer review process. In recent years, high-profile papers have passed peer review only to be heavily criticized after publication (such as the 2011 “arsenic DNA” paper in Science that claimed a particular bacterium could incorporate arsenic into its DNA—a finding most biologists have since rejected). Others have been retracted amid allegations of fraud (consider the now-infamous 1998 Lancet paper claiming a link between vaccines and autism). Many scientists worry that requiring approval from colleagues makes it less likely that new or controversial ideas will be published. Nature’s former editor John Maddox was fond of saying that the groundbreaking 1953 DNA paper would never have made it past modern peer review because it was too speculative. In 2011, Great Britain’s House of Commons commissioned a report on the state of peer review. The report concluded that while peer review “is crucial to the reputation and reliability of scientific research,” many scientists believe the system stifles innovation and that “there is little solid evidence on its efficacy.”
If peer review is indeed broken, as some observers have claimed, an important part of fixing it may be adjusting our expectations of it. It seems a bit ambitious to ask any bureaucratic process to distinguish scientific successes from scientific mistakes with total accuracy. Scientific findings will always be questioned after publication and some will ultimately be rejected, including ones by excellent scientists. Although there are good reasons to solicit expert feedback on scientific articles before publication, the conversation about whether something is “real science” does not end when an article reaches print.
Hopefully, this is what will happen with the BICEP 2 gravitational waves: the findings will be written up for publication, referees will offer suggestions and criticisms, and the final paper will have survived some hard questioning. But we should expect, and hope, that debate about whether the findings are reliable and what they mean for our understanding of the universe will continue long after the referees submit their reports.
This piece originally appeared at Zocalo Public Square.