Forecasting: How and why the social sciences can attempt to predict the future

The social sciences tend to receive a lot of criticism from natural scientists. Physics, medicine, and chemistry have made significant progress in the past few centuries, while sociology and political theory can claim comparatively few concrete wins. Psychology is suffering from an infamous replication crisis, and we still lack a definitive theory of how to make a developing country rich.

This is in part because controlled experiments, a key part of the scientific method, are far more difficult to run in the social sciences than in the natural sciences. In chemistry, you can easily combine different chemicals in a lab and observe how they interact. In the social sciences, not only is it often impossible to recreate such controlled lab settings, but because you are dealing with real people, you run into ethical problems. You ethically can’t intentionally expose children to high levels of stress at home to see how it affects their development, and you logistically can’t randomly assign different theories of economic development to developing countries and compare the results.

For this reason, social scientists often have to be more creative about how they test their theories and produce evidence for or against them. One tool that is especially useful to this end is forecasting: making falsifiable predictions about future events. Forecasting can produce evidence for or against a theory by asking, for instance, “if a given theory of international relations were true, what would it predict for the Russia-Ukraine conflict?” If a theory repeatedly makes predictions that turn out to be false, that is evidence against it; if its predictions repeatedly turn out to be true, that is evidence in its favor.

Forecasting as a method of falsification is useful not only for testing empirical academic theories; it can be used to test empirical beliefs more generally. For instance, it can test the accuracy of the commentary of different pundits: if a certain pundit’s narrative about a geopolitical event is accurate, what does that narrative predict? When a public figure makes falsifiable claims about the world, as opposed to vague statements that can never be resolved as objectively true or false, we can quantifiably measure their credibility. A long track record of being right about the future gives us reason to trust their other empirical claims, while a record of being consistently wrong suggests we should pay them less attention.
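
One standard way to make that measurement is the Brier score, the mean squared error between a forecaster’s stated probabilities and what actually happened; it is the scoring rule Tetlock’s research (discussed below) relies on. Here is a minimal sketch in Python, with an invented track record standing in for a real pundit’s claims:

    # Scoring a (hypothetical) pundit's track record with the Brier score:
    # the mean squared error between stated probabilities and what happened.
    # 0.0 is perfect, always hedging at 50% scores 0.25, and lower is better.
    def brier_score(forecasts):
        """forecasts: list of (probability assigned, whether it happened)."""
        return sum((p - happened) ** 2 for p, happened in forecasts) / len(forecasts)

    # Three falsifiable claims and how they resolved (all invented):
    track_record = [
        (0.9, True),   # "90% chance the incumbent wins" -- they did
        (0.7, False),  # "70% chance inflation tops 5%" -- it didn't
        (0.2, False),  # "20% chance the treaty gets signed" -- it wasn't
    ]
    print(f"Brier score: {brier_score(track_record):.3f}")  # -> 0.180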

But perhaps more importantly, beyond its ability to test empirical beliefs, prediction is often the end goal of social science theories and punditry in its own right. Knowing in advance how many Covid-19 cases there will be six months from now, how much money an investment will make, who will win the next presidential election, what the inflation rate will be next year, whether a given marriage will end in divorce, or how many degrees warmer the Earth will be in three decades is extremely useful for decision-making.

Forecasting social phenomena like these is harder than forecasting physical phenomena for several reasons: more variables are relevant, uncertainties are larger, and less data is available. Still, forecasting has been an explicit objective of social science for about two centuries, and meaningful progress has started to appear in the past decade or so.

In 2011, an American intelligence organization known as the Intelligence Advanced Research Projects Activity (IARPA) ran a competition called the Aggregative Contingent Estimation program to identify the best methods to make geopolitical forecasts. An organization called the Good Judgment Project entered as a contestant, with the simple strategy of “harnessing the wisdom of the crowd to forecast world events”. The most talented forecasters in GJP (defined as the top 2% and dubbed “superforecasters”) were all ordinary citizens, and were “reportedly 30% better than intelligence officers with access to actual classified information.” In fact, the average expert did about as well as a “dart-throwing chimp”, in the words of Professor Philip Tetlock, who wrote at length about the competition in his book Superforecasting: The Art and Science of Prediction.

Tetlock attributes this performance gap to the fact that the informational advantage intelligence insiders gained from access to classified material was outweighed by their relative lack of the intellectual aptitudes the top GJP forecasters possessed, such as intellectual humility, open-mindedness, and ideological agnosticism.

GJP went on to find that forecasting is a skill you can significantly improve with practice. By making falsifiable predictions and keeping track of your reasoning and performance, you can learn which parts of your worldview seem to be incorrect, sharpen your technique, and become more accurate. Key skills include:

  • Being well-calibrated, meaning that of all the predictions you make with 90% confidence, about 90% turn out to be true, of those made with 10% confidence, about 10% turn out to be true, and so on (a sketch of a calibration check follows this list); 
  • Making Fermi estimates, which decompose a hard-to-estimate quantity into smaller, more easily estimated factors (see the second sketch below); 
  • Identifying relevant base rates, meaning the historical frequency with which a class of events results in a given outcome; and
  • Balancing inside and outside views, where the inside view reasons from the details of the specific case and the outside view reasons from the relevant base rates.
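
To make calibration concrete, here is a minimal sketch of a calibration check in Python; the track record it scores is entirely hypothetical, and the ten-point confidence buckets are just one reasonable choice. It groups past predictions by stated confidence and compares each bucket’s label to the share of its predictions that came true:

    # Calibration check: bucket past predictions by stated confidence and
    # compare each bucket's label to the share of predictions that came true.
    # The track record below is hypothetical.
    from collections import defaultdict

    def calibration_table(history):
        bins = defaultdict(list)
        for p, happened in history:
            bins[round(p, 1)].append(happened)  # bucket to the nearest 10%
        for level in sorted(bins):
            outcomes = bins[level]
            hit_rate = sum(outcomes) / len(outcomes)
            print(f"stated {level:.0%}: {hit_rate:.0%} came true "
                  f"({len(outcomes)} predictions)")

    history = [(0.9, True), (0.9, True), (0.9, False), (0.6, True),
               (0.6, False), (0.1, False), (0.1, False), (0.1, True)]
    calibration_table(history)

In this sample, predictions stated with 90% confidence come true only two-thirds of the time, which is the classic signature of overconfidence.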

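Fermi estimation can be illustrated just as briefly. The sketch below works through the classic “how many piano tuners are in a big city?” puzzle; every number in it is a rough, illustrative guess rather than a sourced figure, and the point is that multiplying several easy guesses tends to beat guessing the final quantity directly:

    # Fermi estimate for the classic "piano tuners in a big city" puzzle.
    # Every number is a rough, illustrative guess, not a sourced figure.
    population = 3_000_000                   # people in the city (guess)
    people_per_household = 2.5               # (guess)
    households_with_a_piano = 0.05           # share of households (guess)
    tunings_per_piano_per_year = 1           # (guess)
    tunings_per_tuner_per_year = 2 * 5 * 50  # ~2/day, 5 days/week, 50 weeks

    pianos = population / people_per_household * households_with_a_piano
    tuners = pianos * tunings_per_piano_per_year / tunings_per_tuner_per_year
    print(f"roughly {pianos:,.0f} pianos and {tuners:.0f} full-time tuners")
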
Yet despite forecasting’s learnability and obvious value, pundits, experts, and thought leaders (with a few exceptions) have yet to adopt public forecasting track records as a way to establish their credibility. This is largely because they have a conflict of interest: people with an established reputation in a field have little incentive to put themselves through a process that could prove them wrong and damage that reputation.

Therefore, this type of forecasting tends to be restricted to online communities of enthusiasts. Several websites let you both view forecasts and make your own, including the Good Judgment Project’s own website, Metaculus, CSET Foretell, and Manifold Markets. Forecasts span every category, from elections to timelines for technological breakthroughs to economic metrics to Covid-19 cases, and you can submit your own questions for others to forecast. While these sites are open to anyone, their weighted community predictions tend to be very well calibrated and roughly as accurate as a superforecaster’s.

These forecasting communities are still relatively small, but have nevertheless seen some significant successes. Perhaps most notably, they were considerably quicker to predict the Covid-19 pandemic than most experts in the media at the time, and have maintained a track record of strikingly accurate forecasts throughout the past two years. Time Magazine has also reported that “Good Judgment’s superforecaster team has a track record of success, having made accurate predictions about world events like the approval of the United Kingdom’s Brexit vote in 2020, Saudi Arabia’s decision to partially take its national gas company public in 2019, and the status of Russia’s food embargo against some European countries also in 2019.” Yale News reported that “a majority of Metaculus users correctly predicted that the LIGO research team would announce the discovery of gravitational waves by the end of March; a majority also correctly predicted that an Artificial Intelligence player would beat a professionally ranked human player at the ancient game of Go, and that physicists at the Large Hadron Collider would not discover a new, subatomic boson particle.”

As the forecasting community grows and receives more attention, we will hopefully see more applications of these accurate predictions in health, policy-making, and other areas. Forecasting can help us decide who to listen to, plan for the future, and make better decisions, and its going mainstream would make us a wiser and more epistemically sound society. If you are as excited for this to become a reality as I am, you can follow the story by keenly tracking the community’s forecasts of its own growth over the coming years.
