TIME Research

Only a Third of Psych Studies Are Reliable. Now What?


We must stop treating single studies as unassailable authorities of the truth

The ability to repeat a study and find the same results twice is a prerequisite for building scientific knowledge. Replication allows us to ensure empirical findings are reliable and refines our understanding of when a finding occurs. It may surprise you to learn, then, that scientists do not often conduct – much less publish – attempted replications of existing studies.

Journals prefer to publish novel, cutting-edge research. And professional advancement is determined by making new discoveries, not painstakingly confirming claims that are already on the books. As one of our colleagues recently put it, “Running replications is fine for other people, but I have better ways to spend my precious time.”

Once a paper appears in a peer-reviewed journal, it acquires a kind of magical, unassailable authority. News outlets, and sometimes even scientists themselves, will cite these findings without a trace of skepticism. Such unquestioning confidence in new studies is likely undeserved, or at least premature.

A small but vocal contingent of researchers – addressing fields ranging from physics to medicine to economics – has maintained that many, perhaps most, published studies are wrong. But how bad is this problem, exactly? And what features make a study more or less likely to turn out to be true?

We are two of the 270 researchers who together have just published in the journal Science the first-ever large-scale effort trying to answer these questions by attempting to reproduce 100 previously published psychological science findings.

Attempting to re-find psychology findings

Publishing together as the Open Science Collaboration and coordinated by social psychologist Brian Nosek from the Center for Open Science, research teams from around the world each ran a replication of a study published in three top psychology journals – Psychological Science; Journal of Personality and Social Psychology; and Journal of Experimental Psychology: Learning, Memory, and Cognition. To ensure the replication was as exact as possible, research teams obtained study materials from the original authors, and worked closely with these authors whenever they could.

Almost all of the original published studies (97%) had statistically significant results. This is as you’d expect – while many experiments fail to uncover meaningful results, scientists tend only to publish the ones that do.

When these 100 studies were rerun by other researchers, however, only 36% reached statistical significance. This number is alarmingly low. Put another way, only around one-third of the rerun studies came out with the same results that were found the first time around. That rate is especially low when you consider that, once published, findings tend to be held as gospel.

The bad news doesn’t end there. Even when the new study found evidence for the existence of the original finding, the magnitude of the effect was much smaller — half the size of the original, on average.

One caveat: just because something fails to replicate doesn’t mean it isn’t true. Some of these failures could be due to luck, or poor execution, or an incomplete understanding of the circumstances needed to show the effect (scientists call these “moderators” or “boundary conditions”). For example, having someone practice a task repeatedly might improve their memory, but only if they didn’t know the task well to begin with. In a way, what these replications (and failed replications) serve to do is highlight the inherent uncertainty of any single study – original or new.

More robust findings more replicable

Given how low these numbers are, is there anything we can do to predict the studies that will replicate and those that won’t? The results from this Reproducibility Project offer some clues.

There are two major ways that researchers quantify the nature of their results. The first is a p-value, which is often loosely described as the probability that the result arose purely by chance and is a false positive. (Technically, the p-value is the chance that the result, or a stronger result, would have occurred even when there was no real effect.) Generally, if a statistical test shows that the p-value is lower than 5%, the study’s results are considered “significant” – unlikely to have arisen by chance alone.

Another way to quantify a result is with an effect size – not how reliable the difference is, but how big it is. Let’s say you find that people spend more money in a sad mood. Well, how much more money do they spend? This is the effect size.

We found that the smaller the original study’s p-value and the larger its effect size, the more likely it was to replicate. Strong initial statistical evidence was a good marker of whether a finding was reproducible.
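To make the two measures concrete, here is a minimal sketch in Python. It is not the Reproducibility Project's analysis: the spending figures are invented for illustration, the function names are our own, and the p-value is estimated with a simple permutation test (shuffling group labels) rather than whatever test an original study used.

```python
import random
import statistics


def cohens_d(a, b):
    """Effect size: difference in means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_var ** 0.5


def permutation_p_value(a, b, n_shuffles=10_000, seed=0):
    """Two-sided p-value: how often shuffled labels give a mean difference
    at least as large as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[: len(a)]) - statistics.fmean(pooled[len(a):])
        )
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical dollars spent by eight shoppers in each mood (made-up data).
sad = [12, 15, 14, 18, 20, 17, 16, 19]
neutral = [10, 11, 13, 12, 9, 14, 12, 11]

print("effect size (Cohen's d):", round(cohens_d(sad, neutral), 2))
print("permutation p-value:", permutation_p_value(sad, neutral))
```

The p-value answers "how surprising is this difference if mood made no difference at all?", while the effect size answers "how big is the difference?" A study can score well on one and poorly on the other.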

Studies that were rated as more challenging to conduct were less likely to replicate, as were findings that were considered surprising. For instance, if a study shows that reading lowers IQs, or if it uses a very obscure and unfamiliar methodology, we would do well to be skeptical of such data. Scientists are often rewarded for delivering results that dazzle and defy expectation, but extraordinary claims require extraordinary evidence.

Although our replication effort is novel in its scope and level of transparency – the methods and data for all replicated studies are available online – its results are consistent with previous work from other fields. Cancer biologists, for instance, have reported replication rates as low as 11% to 25%.

We have a problem. What’s the solution?

Some conclusions seem warranted here.

We must stop treating single studies as unassailable authorities of the truth. Until a discovery has been thoroughly vetted and repeatedly observed, we should treat it with the measure of skepticism that scientific thinking requires. After all, the truly scientific mindset is critical, not credulous. There is a place for breakthrough findings and cutting-edge theories, but there is also merit in the slow, systematic checking and refining of those findings and theories.

Of course, adopting a skeptical attitude will take us only so far. We also need to provide incentives for reproducible science by rewarding those who conduct replications and who conduct replicable work. For instance, at least one top journal has begun to give special “badges” to articles that make their data and materials available, and the Berkeley Initiative for Transparency in the Social Sciences has established a prize for practicing more transparent social science.

Better research practices are also likely to ensure higher replication rates. There is already evidence that taking certain concrete steps – such as making hypotheses clear prior to data analysis, openly sharing materials and data, and following transparent reporting standards – decreases false positive rates in published studies. Some funding organizations are already demanding hypothesis registration and data sharing.

Although perfect replicability in published papers is an unrealistic goal, current replication rates are unacceptably low. The first step, as they say, is admitting you have a problem. What scientists and the public now choose to do with this information remains to be seen, but our collective response will guide the course of future scientific progress.

This article originally appeared on The Conversation

TIME space

See Breathtaking Views of the National Parks From Space

Yosemite, Redwood, and other famous parks as they look from outer space

TIME climate change

Here’s What Those Weird Blue Clouds Mean

Jamie Cooper—Getty Images Photographs of noctilucent clouds appearing in the night sky over Britain on Jul. 15, 2009.

They could be related to climate change

If you’ve noticed some strange blue clouds in the night sky recently, you’re not alone. Uncharacteristically blue nighttime clouds, usually seen over polar regions, have been visible as far south as Colorado and Northern California in recent years.

The clouds, known as “noctilucent clouds” or NLCs, glow blue at night because tiny ice crystals 50 miles (80 km) above the earth are reflecting sunlight from the other side of the planet, according to SFGate. And some scientists say the glowing blue clouds may be yet another effect of climate change.

The vast majority of scientists agree that climate change is real, but NLCs are a good example of how sometimes the secondary effects of climate change may not yet be completely understood. It’s not clear exactly what the glowing clouds have to do with a changing climate, though there are some theories being discussed. One is that methane emissions can create water droplets at high altitudes, which can lead to NLCs, SpaceWeather.com’s Tony Phillips told SFGate. Another idea is that as the Earth’s surface is heating up, the higher layers of the atmosphere (like the mesosphere, where NLCs form) are actually getting colder, allowing the tiny ice crystals to form, University of Colorado Professor Gary Thomas told NASA in 2003.

So if you see blue clouds glowing at night, it may be yet another effect of climate change.

TIME space travel

A Japanese Drinks Company Just Sent Some Whiskey to the International Space Station

But the astronauts won't get to drink a drop

A Japanese resupply spacecraft successfully docked to the International Space Station (ISS) on Monday, and on board there was some unusual cargo.

Included in the 10,000 lb. of supplies were five whiskey samples sent into orbit by Japanese alcoholic-drinks conglomerate Suntory, reports the Associated Press.

But astronauts on board the ISS won’t be able to drink a drop of the liquor, which was sent as part of an experiment to see whether spirits mellow at the same pace in microgravity as they do on earth.

The research is being conducted in the Japanese Experiment Module on the ISS and researchers at Suntory hope the experiments will help find a scientific explanation for the “mechanism that makes alcohol mellow.”

An identical set of samples is being stored in Japan and after a year or so the samples in orbit will return to earth to be compared, analyzed and tasted.

The whiskey experiment isn’t the first drinks-related study to take place on the ISS. Already on board are specially designed coffee cups that have revolutionized how astronauts drink in space and could help scientists build better and safer advanced fluid systems.

And on earth a company called Cosmic Lifestyle Corp. has even invented a zero-gravity-friendly martini glass.

TIME Science

The Strange Story of the First People to Die From Nuclear Weapons During Peacetime

AP A Feb. 25, 1955 view of the well-guarded Los Alamos, N.M. birthplace of the A-bomb and other thermonuclear weapons.

Seventy years ago, a young physicist made a tragic mistake

The first wartime deaths from nuclear weaponry were vast in number and world-changing in scope. The first peacetime deaths from that same technology were far quieter incidents, free of violence but still illustrative of the awful power of the bomb.

The physicist Louis Slotin was part of the team that figured out how much nuclear material (plutonium and uranium) would be needed for the bombs used at the end of World War II. And as Richard L. Miller explained in his 1986 book Under the Cloud, Slotin wanted to see his work through to the end by accompanying the pilots who dropped the bomb, but he wasn’t given permission. Frustrated, he decided to go on vacation instead and leave his young assistant, Harry Daghlian, to continue his experiments while he was away.

On Aug. 21, 1945, Daghlian was stacking tungsten carbide bricks as a reflector around a plutonium core, according to a Los Alamos report on nuclear accidents. The idea was that the tungsten would reflect neutrons, meaning that you’d need less plutonium to get a nuclear reaction going. Daghlian’s lab instruments showed him that the next tungsten brick he added to the stack would bring the experiment to a critical point (i.e., the reaction would begin). He moved to take the brick away, but his hand slipped and it fell into the middle of the stack. Though he quickly took the assembly apart, it was too late.

According to the American Physical Society, Daghlian’s skin turned red and began to peel off as his gastrointestinal system started to fail. He died that September. In a cruel twist, the next person to be killed by peacetime atomic science, in a similar mishap the next year, was Louis Slotin.

After that death, TIME (misidentifying Slotin as “the first peacetime victim of nuclear fission”) explained the science behind such accidents:

Apparently Dr. Slotin and seven or more other scientists were working with “subcritical masses” of uranium or plutonium. Kept apart, these masses were lifeless as lead, but if brought together to form a mass above “critical” size, a chain reaction would start. Its violence would depend on the character of the materials. Probably they were midway in activity between mild-mannered natural uranium and furious plutonium 239.

Bringing such “reactors” together is touchy business. The scientists work with infinite caution, watching instruments which measure the number of free neutrons within the experimental mass. Under some conditions, the chain reaction starts slowly. But sometimes it leaps into violence in a millionth of a second. There is no explosion, no vibration, no sound. No human sense can detect the outburst of deadly radiation. The only warning, which comes too late, is a faint bluish glow. Some experts think it is caused by ionization of the air; others believe it to be an optical illusion telegraphed to the brain by stimulated nerves behind the eyes.

Slotin suffered radiation burns while breaking apart the materials. According to a letter to the editor about that article, Slotin knew that he was sure to die within days but still returned to the lab to explain his work to the staff, “not wanting any knowledge to die with him.”

As for that staff, they had learned from the mistakes of those before them and established new safety protocols.

TIME space

Here are the Most Heart-Stopping Photos of Saturn from the Cassini Mission

As the spacecraft completes its final flyby of Saturn's moon Dione, TIME reflects on the most spectacular images from the mission thus far

TIME Underwear

This Underwear Can Protect You From Radiation Exposure


It'll keep your swimmers swimming.

Wireless Armour is a new line of underwear for men, developed by British scientist Joseph Perkins, that supposedly protects your unmentionables from 99.9% of electromagnetic radiation.

Results of a study posted on the Wireless Armour website showed that radiation emitted from WiFi devices, like the smartphone in your pocket or the computer on your lap, killed a quarter of a sperm sample after just four hours of exposure, suggesting that all your Tinder and OkCupid activity might seriously damage your fertility.

The underwear is made from a fabric called RadiaText, which is a blend of cotton and silver. Because silver is a top conductor of electricity, it creates an “unbroken shield” against the radiation, according to the website. The blend makes it both soft and strong, and the silver makes the underwear “highly anti-microbial,” ergo more hygienic.

Prices range from £24 to £35 (about $37 to $55), but can you really put a price on your crown jewels?

TIME Science

Our Power Grid Is More Vulnerable to Space Weather Than We Thought

The entire planet could be affected

The Earth’s magnetic field – known as the “magnetosphere” – protects our atmosphere from the “solar wind.” That’s the constant stream of charged particles flowing outward from the sun. When the magnetosphere shields Earth from these solar particles, they get funneled toward the polar regions of our atmosphere.

As the particles crash into the atmosphere’s ionospheric layer, light is given off, creating beautiful multicolored displays of aurora near both the North and South Poles. These are stunning visual representations of the complex interactions in the near-Earth space environment, which we collectively term “space weather.”

The same space weather that generates these beautiful displays can cause havoc for a wide range of technologies. We’ve known for a while that space weather in high-latitude regions near the poles can cause power grid failures, sometimes causing heavy damage. The most famous instance was the March 1989 blackout in the Northeastern US and up through Quebec, Canada that left millions without power for 12 hours.

But we haven’t thought of equatorial regions as being prime targets. Our new research shows that areas closer to the equator still experience bad space weather – and its disturbing effects on power grid infrastructure.

Changing magnetic fields crank up electric currents

High above the ground in the upper atmosphere are fluctuating electric currents driven by interactions in the magnetosphere and ionosphere. These atmospheric currents cause strong changes in the strength of the local magnetic field on the ground. We can’t feel the magnetic field ourselves, but researchers measure and track it at various points on the Earth’s surface.

That’s all well and good. The problem comes in when these atmospheric currents cause swift changes in the magnetic field. When the magnetic field abruptly changes, it can generate electric currents in conductors at the Earth’s surface – for instance, long pipes or wires such as oil and gas pipelines or power transmission lines. This process of electric current generation is called magnetic induction.
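The link between how fast the field changes and how much voltage is induced follows Faraday's law of induction. A rough sketch in Python, assuming a uniform field threading a fixed conducting loop; the function name and every number below are hypothetical and chosen only to show the scaling, not taken from our study or from any real power line:

```python
def induced_emf(area_m2, delta_b_tesla, delta_t_s):
    """Faraday's law for a uniform field through a fixed loop:
    EMF (volts) = -(loop area) * (change in field) / (time taken)."""
    return -area_m2 * delta_b_tesla / delta_t_s


# Hypothetical loop: a 100 km line closing through the ground 1 km away.
loop_area = 100_000 * 1_000  # square meters (assumed geometry)
field_change = 100e-9        # a 100 nanotesla change (assumed size)

slow = induced_emf(loop_area, field_change, 60.0)  # change spread over a minute
fast = induced_emf(loop_area, field_change, 6.0)   # same change in six seconds

print("slow change:", slow, "volts")
print("fast change:", fast, "volts")
```

The same change in field strength packed into a tenth of the time drives ten times the voltage, which is why abrupt changes matter more than large but gradual ones.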

These electric currents are not-so-creatively called geomagnetically induced currents, or GICs for short. The high-latitude regions are most susceptible to GICs because of the intense electric currents flowing through the auroras, thanks to the way the solar wind gets diverted when it hits the Earth’s magnetosphere. However, the entire planet can be affected to varying degrees.

When they occur, GICs effectively generate extra electric current in power grid infrastructure through magnetic induction. Power grids, during large events, can end up taking on more electricity than they can handle. These induced currents have caused numerous equipment failures that have led to power outages for large populations.

Trouble at the equator too, not just near the poles

Those same geomagnetically induced currents that happen in the high-latitude regions can happen around the equator of our planet too. There, they are caused not by the auroral electric current system we find near the poles, but by a weaker low-latitude counterpart called the equatorial electrojet. Like the high-latitude ionospheric current system, the equatorial electrojet’s electric current can be detected on the ground using magnetic field observations.

Recently researchers reported that GIC activity is enhanced at the equator during severe geomagnetic storms – that’s when solar eruptions called “coronal mass ejections” trigger shock waves that hit the Earth. They pointed the finger at the equatorial electrojet as a suspected cause.

In our new research article in Geophysical Research Letters, we show that countries near the magnetic equator are more vulnerable to space weather than previously thought.

Rather than focusing on severe geomagnetic storms, such as the 2003 Halloween event that caused power grid problems in Sweden (among many other things), we took a different tack. Our analysis focused on the arrival of interplanetary shocks. These are abrupt pressure increases in the solar wind – that stream of plasma constantly flowing out of the sun. When these shocks hit the Earth’s magnetosphere, the impact causes a sudden magnetic field change that can be measured all over the world.

Interplanetary shocks regularly announce the beginning of a geomagnetic storm. But many pass by relatively benignly without developing into a full-blown geomagnetic storm. We noticed that the magnetic response to these shock arrivals was sometimes significantly stronger at the magnetic equator when compared to locations only a few degrees away. Why?

An analysis of how these equatorial responses differed throughout the day revealed they were strongest around noon and weakest at night. This daily contrast corresponds to the well-known variations in the equatorial electrojet. It’s strong evidence that the equatorial electrojet is amplifying the geomagnetically induced current activity during interplanetary shock arrivals in a way that hasn’t really been recognized until now.

Effects on equatorial power grids

This result has significant implications for the many countries located beneath the equatorial electrojet that may be operating power infrastructure not initially designed to cope with space weather. These countries need to look into ways of protecting their infrastructure during geomagnetically quiet periods as well as during severe geomagnetic storms.

One of our coauthors, Dr Endawoke Yizengaw from Boston College, grew up in Ethiopia, within the equatorial electrojet’s region of influence. He recalls regular unexplained power outages during his childhood and wonders whether interplanetary shocks may have played a role. We hope to be able to answer this question in the near future.

Scientists around the world are conducting ongoing research to better understand the effects of these geomagnetically induced currents on power grids. It’s becoming increasingly clear that we need to investigate the effects of quiet periods, not just major events. What happens during these quiet times, and in regions often overlooked, can have a significant impact on our increasingly technology-dependent society.

This article originally appeared on The Conversation


TIME Ideas hosts the world's leading voices, providing commentary and expertise on the most compelling events in news, society, and culture. We welcome outside contributions. To submit a piece, email ideas@time.com.

TIME Science

What It’s Like To Work in the Body Donation Industry

Zócalo Public Square is a not-for-profit Ideas Exchange that blends live events and humanities journalism.

I had to ask next-of-kin uncomfortable questions about the deceased

For over three years, I thought about death every day. This wasn’t some morbid obsession. It was my job.

A growing number of senior citizens—both permanent residents and part-time “snowbirds”—have settled in neighborhoods and mobile home parks across Phoenix, Arizona. All of these out-of-state transplants and seniors have created a growing demand for alternatives to traditional funerals.

When I was 23, I started working for one of several whole-body donation organizations that serve the region. When someone dies, his or her family has several options for the body: a viewing and burial, cremation, embalming, or donation. Donating tissue, like organs, corneas, and skin, can also take place.

My time in the whole body donation industry began on Craigslist. The help wanted ad didn’t list the organization’s name. When my future boss called to set up an interview, I thought she was a telemarketer. I could’ve hung up then, missed the opportunity, and been none the wiser about the death industry.

Instead, I stayed on the phone, and she ended up hiring me for the receptionist position.

I answered phones, typed up letters, and ran to the post office. It was a normal office job, except for the cadavers less than 100 feet from my desk, sealed away in the laboratory.

Soon I was promoted and began taking “notification of death” calls. Some people signed up to donate their bodies to science. Family members or a hospice nurse called to inform me of the donor’s passing, and I arranged for mortuary transportation. Other times, people called in dazed. Their father or mother or sister or spouse had died, they told me, and they didn’t know what to do.

I couldn’t do much. I couldn’t undo their loved one’s death or say something wise to make everything better. But I could give them a to-do list: call this person, sign this, fax this, and answer these questions.

It was an industry I hardly knew existed until I was part of it. Soon, I started picking up on references to donation on television shows like Bones and Law & Order: SVU. The storylines usually involved a nefarious character in a suit, stolen body parts, or a funeral home with a hidden autopsy suite.

These plotlines are not baseless. The reputation of the body- and tissue-donation industry as a whole has been damaged by the actions of dishonest hospitals, funeral homes, and doctors. Less than a decade ago, the CEO of a tissue recovery firm was arrested for selling illegally harvested body parts. Funeral homes, donation companies, and hospitals have also been exposed for forging consent forms and unlawfully obtaining tissue. In 2013, I watched as FBI vans and news helicopters swarmed a nearby tissue donation firm. Our phones began ringing off the hook. I assured panicked callers that no, that was not our organization on the television, and that yes, we could help with arrangements for their loved ones. I feel lucky to have worked for an ethical business in such a loosely regulated industry.

My supervisor required us to inform people fully of the nature of full body donation, answer all questions, and only proceed with witnessed, written consent. Not all religions and communities support donation. I made it clear that people should only donate if they were 100 percent comfortable with the process.

Whole body or anatomical donation places organs, body parts, and other tissues with medical facilities. The tissue is then used for training doctors, developing new treatments and medications, and researching diseases, from breast cancer to dementia. What is not used for research is cremated and returned to the family.

Death is expensive. Traditional funerals cost upwards of $6,000 and even simple services can force families to choose between paying rent and paying for a cremation. The organization where I worked covered the cost of mortuary transportation, cremation, and the return of ashes to the family. We took care of all the necessary paperwork and tried to whittle down any other costs. Usually families were just left with paying the county for death certificates, which are about $20 apiece.

Some people decided to donate because of financial hardship. But for most people, the decision to donate was not a financial choice. I saw people who had suffered for years from cancer sign up for donation because they thought it was one final way to fight the disease.

Sometimes, especially when the person died in hospice, a nurse or social worker would step up and help the family plan for a funeral home or alternative service. Other times, it was a family member who called. They found us in a Google search or were referred by a friend and were fumbling through a bewildering situation.

If the deceased was registered to donate, I started the transportation and donation process immediately. If they were not, I ran through a list of questions. Depending on the answers, I either carefully told the next-of-kin that their loved one did not qualify for whole body donation or informed them that they were accepted. Hepatitis C or HIV/AIDS rules out donation of any kind.

The majority of my workday consisted of filling out medical questionnaires. I called the family sometimes within hours of the passing and asked questions ranging from routine medical questions to intimate social history. As an introverted child and teen, I was too shy to call in a pizza delivery order. In this job, I was on the phone calmly inquiring about drug use, tattoos, and sexually transmitted diseases.

With the phone cradled on my shoulder, I frantically typed and Googled unfamiliar medical conditions. I memorized the correct spellings of aneurysm, Levothyroxine and myelodysplastic syndrome.

The most surreal part of the job was when the donation was done, and I personally delivered the ashes back to the family. I logged thousands of miles on my VW station wagon. The other coordinator and I drove so much the owners eventually bought a company car that was much nicer than either of our vehicles. I traveled to every corner of the valley to multi-million dollar homes, gated suburbs, and rusted trailer parks.

Some days were good. The families were in mourning but thanked me profusely. I could tell they were content with their decision. People invited me into their living rooms. They showed me photos of their loved ones and detailed their plans to spread the ashes on the beach or in the forest or in the Colorado River.

Other days, not so much. I visited hoarders with houses so crammed with card tables, boxes, and cat food that I had to come in through the garage. I delivered to homes in the empty stretches of Apache Junction that gave off a distinct meth house vibe, complete with cardboard jammed over the windowpanes and a television set smashed on the lawn. After delivering the ashes of a 20-year-old to his grieving mother, I sat in the car for 20 minutes and tried to shake off a wave of sadness.

People expressed shock to see me, a person in her 20s, on their doorstep, holding their loved one’s ashes. They remarked on how young I was. (I think they expected a gaunt-faced, middle-aged man in a gloomy suit.)

“How’d you end up with this kind of job?” they’d ask.

The death industry is not an easy business to work in. You carry the sadness and anger of your work day home with you. Sometimes you have nightmares. After reading too many medical examiner’s reports, you create a mental list of ways you do not want to die.

I still didn’t come any closer to understanding death. I couldn’t ever define what I wanted after my death, but I realized that talking about it was the best way to prepare for it. Ignoring death just leaves empty spaces and gaps for the survivors to guess their way through.

One thing that struck me was how the reports always describe the state of cleanliness of a residence where someone dies. Now I find myself making a point of keeping my home neat, because you never know when death might visit. And I’d like to avoid the judgment of the medical examiner.

My time in the industry gave me endless anecdotes and cautionary tales. With my boss’s encouragement, I re-enrolled in school and accepted a journalism internship. I left the job after three years, but felt a decade older. I knew my time in the death industry was over.

Whitney M. Woodworth, an Arizona State University Cronkite Sustainability Fellow at Zócalo Public Square, is a senior at ASU

