Like the foreshocks of a massive earthquake, a crisis has been brewing in the world of academic psychology, known as the "replication crisis." It has now made The New York Times.
Put simply, the findings in many — maybe most — experiments done in academic psychology and published in peer-reviewed journals cannot be replicated. In other words, if you run the experiment twice, you get a different result.
That, to put it lightly, is a problem.
The reason we know gravity exists is that if you run the experiment of dropping a pin to the ground, you will get an identical result every single time. This is what the entire scientific revolution was built on: Through controlled experiments, you can derive reliable scientific findings. And the way you know those findings are reliable is if the experiment can be replicated. If, when you dropped a pin, half the time it fell to the ground and half the time it flew off in a random direction, we would quite literally not know the law of gravity.
That word "controlled" in "controlled experiment" is the key one, the clincher. The reason the scientific method produces reliable findings is that experiments are set up to rule out every possible cause of an effect, save one. Thus, when you run an experiment, you know which cause leads to which effect.
We sometimes hear the myth of how Galileo disproved Aristotle's theory that heavier objects fall faster than lighter ones by dropping two balls from the Tower of Pisa. But in reality he built elaborate contraptions to run his experiment, to account for factors such as air resistance. If he'd simply dropped two objects from the Tower of Pisa, anybody could have legitimately concluded that air resistance, or the wind, or any other factor, had led to the result. It's the fact that the experiment was both controlled and replicable that made his finding reliable, and allowed him in a short time to overturn a 2,000-year-old consensus.
The danger a controlled experiment guards against is what statisticians call "omitted variable bias." If your experiment doesn't account for all potential variables, then there is always the possibility that a variable other than the one you're trying to measure is affecting your result. And that means your result is not reliable.
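To make omitted variable bias concrete, here is a minimal simulation sketch. Everything in it is a made-up illustration, not data from any real study: a hidden variable `z` drives both `x` and `y`, while `x` itself has no effect on `y` at all. When `x` is merely observed, it appears to predict `y`; when `x` is assigned at random, as in a controlled experiment, the apparent effect should vanish.

```python
import random

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

def simulate(n=10_000, seed=0):
    """Return (observed slope, controlled slope) for hypothetical data."""
    rng = random.Random(seed)
    # Hypothetical hidden variable z: it drives y, and x has NO effect on y.
    zs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [2 * z + rng.gauss(0, 1) for z in zs]
    # Uncontrolled observation: x is also influenced by the omitted variable z.
    xs_observed = [z + rng.gauss(0, 1) for z in zs]
    # Controlled experiment: x is assigned at random, independent of z.
    xs_controlled = [rng.gauss(0, 1) for _ in range(n)]
    return slope(xs_observed, ys), slope(xs_controlled, ys)

observed, controlled = simulate()
print(f"slope when z is omitted:     {observed:.2f}")
print(f"slope when x is randomized:  {controlled:.2f}")
```

Under these assumptions the first slope should land near 1 and the second near 0, even though the true effect of `x` on `y` is zero in both cases. The only thing that changes is whether the hidden variable was free to influence `x`.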
Which leads to an obvious problem: The more potential variables you have, the harder it is to be sure you've isolated all of them. This is known as the problem of causal density. The more potential causes there are, the harder it is to isolate any one of them.
And it turns out that within the Universe, the thing — as far as we know — with the highest causal density is human beings. We know a lot — a lot — more about the Big Bang, billions of years ago, than we do about what makes us happy or sad. (Think about that for a second.) Or, for that matter, how interest rate policy maximizes growth and employment, or how to effectively police a neighborhood.
Let's say, as an example, you have a psychology experiment trying to show that when people are happy they learn better. You might have a group of people who are given something to memorize, then are shown happy pictures, then have to recall it. And you have a control group of people who aren't shown the happy pictures. And let's say the result is positive (i.e. statistically significant according to some more-or-less arbitrary criteria that the profession of academic psychologists has decided upon).
Well, you can get it published in a peer-reviewed journal. But have you actually proven anything?
You've proved that 20 undergraduates who were bribed with beer money, on such and such a date and in such and such location, could memorize stuff better when they were shown happy pictures. Does that actually prove that, for all human psychology, we learn better when we're happy? Well, if you can get the experiment replicated many times, it might.
But the causal density in human beings is so extreme that you can't get it replicated, and so, in reality, it doesn't.
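A rough way to see how this plays out is to simulate it. The sketch below rests entirely on hypothetical assumptions: the "true" effect of happy pictures is taken to differ from one context to the next (a different place, date and set of undergraduates), centered on zero with a spread set by the made-up parameter `context_sd`. Each study has 10 people per group, only positive and significant results get "published," and each published study is then rerun once in a fresh context.

```python
import random
import statistics

# Two-sided 5% cutoff for a t-test with 18 degrees of freedom (10 + 10 - 2).
T_CRIT = 2.101

def run_study(rng, effect, n_per_group=10):
    """Return the t statistic comparing a treated group to a control group."""
    control = [rng.gauss(0, 1) for _ in range(n_per_group)]
    treated = [rng.gauss(effect, 1) for _ in range(n_per_group)]
    pooled_var = (statistics.variance(control) + statistics.variance(treated)) / 2
    std_err = (2 * pooled_var / n_per_group) ** 0.5
    return (statistics.mean(treated) - statistics.mean(control)) / std_err

def replication_counts(n_studies=2000, context_sd=0.5, seed=1):
    """Return (published, replicated) counts under the stated assumptions."""
    rng = random.Random(seed)
    published = replicated = 0
    for _ in range(n_studies):
        # Assumption: the true effect varies with context (place, date, sample).
        t_original = run_study(rng, rng.gauss(0, context_sd))
        if t_original <= T_CRIT:
            continue  # only positive, significant results get "published"
        published += 1
        # The replication runs in a new context with its own true effect.
        t_replication = run_study(rng, rng.gauss(0, context_sd))
        if t_replication > T_CRIT:
            replicated += 1
    return published, replicated

published, replicated = replication_counts()
print(f"published: {published}, successfully replicated: {replicated}")
```

With these invented numbers, only a small minority of the "published" findings should survive a single replication attempt. The simulation proves nothing about real psychology. It only shows that small samples combined with context-dependent effects are enough to produce the pattern described above.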
What should we do about all this?
First, we should get more humility. Just because someone has tenure and occasionally wears a lab coat doesn't make them a magician. We all too often think of "science" as something akin to "magic," when it is not. Whether it's an economist or a psychologist, we should realize that what they're doing is, by and large, not science, in the sense that physics is a science.
Second, we should do more experiments, not fewer. Does what I've said mean it's worthless to do psychology experiments? No, actually, it means the opposite. The only way we're going to get replicable experimental results is if we do lots and lots and lots of different experiments, in lots and lots of different contexts.
Third, related to the first and second, is that we should have a different approach to the human sciences, and realize that they are more like a "science" in the ancient sense than they are a science in the experimental sense. The Latin word scientia, where we get "science" from, simply meant knowledge, and knowledge of human beings is closer to a form of wisdom than to a set of laboratory results. We should of course take replicated experimental findings seriously (if with a grain of salt) and we should, again, try to get a lot more of them. But we should also be realistic and realize that, at least for the foreseeable future, Shakespeare and Plato will probably teach us as much about human psychology as textbooks will.
We need a new word, and a new approach, for those disciplines of the human "sciences" where the scientific method is only going to get us part of the way there. For those approaches, we should steer clear of two extremes: a science cargo cult, where we think if we only imitate the moves of physicists we will understand everything, and an anti-intellectualism that rejects all experimental findings. A reliable picture of the human person is more like a jigsaw puzzle. Some parts will come from scientific experimentation, but some parts will also come from the wisdom of the ages, such as philosophy and literature, and we need to fit the pieces together as we get them, and as they change.
How, exactly, do we do that? As yet, I have no idea. But what concerns me is that so few people are even trying.