Saturday, December 7, 2019

We often say humans are Bayesian reasoners; in fact, people reason in a digital manner, assuming that uncertain information is either true or false when using it to make further inferences

Johnson, S. G. B., Merchant, T., & Keil, F. C. (2019). Belief digitization: Do we treat uncertainty as probabilities or as bits? Journal of Experimental Psychology: General. https://doi.org/10.1037/xge0000720

Abstract: Humans are often characterized as Bayesian reasoners. Here, we question the core Bayesian assumption that probabilities reflect degrees of belief. Across eight studies, we find that people instead reason in a digital manner, assuming that uncertain information is either true or false when using that information to make further inferences. Participants learned about 2 hypotheses, both consistent with some information but one more plausible than the other. Although people explicitly acknowledged that the less-plausible hypothesis had positive probability, they ignored this hypothesis when using the hypotheses to make predictions. This was true across several ways of manipulating plausibility (simplicity, evidence fit, explicit probabilities) and a diverse array of task variations. Taken together, the evidence suggests that digitization occurs in prediction because it circumvents processing bottlenecks surrounding people’s ability to simulate outcomes in hypothetical worlds. These findings have implications for philosophy of science and for the organization of the mind.

General Discussion

Do beliefs come in degrees? Here, we showed that they do not when we use those beliefs to make
further predictions—in such cases, probabilities are converted from an ‘analog’ to a ‘digital’ format
and are treated as either true or false. Compared to Bayesian norms, participants across our studies
consistently underweighted low-probability relative to high-probability hypotheses, often ignoring
low-probability events completely. This neglect challenges theories of cognition that posit a central
role for graded probabilistic reasoning. Here, we discuss where this tendency appears to come from
and in what ways it might be limited.

Predictions from Uncertain Beliefs
Many studies have found that when an object’s category is uncertain, people rely on the single
most-probable category when predicting its other features. Although some studies find individual
differences and variability across tasks, single-category use has held up across many different kinds
of categorization schemes (e.g., Johnson, Kim, & Keil, 2016; Lagnado & Shanks, 2003; Malt, Murphy, & Ross, 1995; Murphy & Ross, 1994, 1999).
Plausibly, these limitations on probabilistic reasoning are specific to category-based induction
tasks. The purpose of categories, after all, is to simplify the world and carve it into discrete chunks.
But another possibility is that these previous findings are due to a much broader tendency in our
reasoning about uncertain hypotheses and their implications. A categorization of an object is a
hypothesis about what kind of object it is, but similarly a causal explanation is a hypothesis about what led something to happen and a mental-state inference is a hypothesis about what someone is thinking. The current studies find that people only think in terms of one hypothesis at a time in a causal reasoning task, suggesting that such digital thinking is a broad feature of hypothetical thinking. This is consistent with the singularity hypothesis (Evans, 2007), according to which people entertain only a single possibility at a time—an idea with broad explanatory power in higher-level cognition.
Why does digitization occur when making predictions from uncertain beliefs? Such predictions
typically require three processes. First, potential hypotheses must be evaluated, given the available
evidence, resulting in estimates of the hypothesis probabilities P(A) and P(B) (abduction). Second, the prediction must be made conditionally on each hypothesis holding, that is, in each relevant possible world, resulting in estimates of the predictive probabilities P(Z|A) and P(Z|B) (simulation). Finally, these conditional predictions must be weighted by the plausibility of each hypothesis (integration), leading to an estimate of P(Z) = P(A)P(Z|A) + P(B)P(Z|B). Although people are able to perform each of these processes, each is accompanied by limitations and biases. How does each of these stages contribute to digitization? Our experiments are most consistent with a model in which abduction leads to more extreme explicit hypothesis probabilities, simulation capacity limits result in digitization, and integration leads people to under-use hypothesis probabilities relative to predictive probabilities. This conclusion is necessarily provisional at this early stage, but here we lay out the best case made by the evidence.
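To make the three stages concrete, here is a minimal sketch (not from the paper; the numbers and function names are invented) contrasting the Bayesian norm with the digitized shortcut, assuming A and B are mutually exclusive and exhaustive:

```python
# Minimal sketch (illustrative, not the authors' code): Bayesian norm vs.
# digitized prediction of Z under two mutually exclusive, exhaustive
# hypotheses A and B. All numbers are invented.

def bayesian_prediction(p_a, p_z_given_a, p_z_given_b):
    """Normative answer: P(Z) = P(A)P(Z|A) + P(B)P(Z|B), with P(B) = 1 - P(A)."""
    return p_a * p_z_given_a + (1.0 - p_a) * p_z_given_b

def digitized_prediction(p_a, p_z_given_a, p_z_given_b):
    """The shortcut the studies document: treat the likelier hypothesis as true."""
    return p_z_given_a if p_a >= 0.5 else p_z_given_b

p_a, p_z_given_a, p_z_given_b = 0.75, 0.9, 0.1   # hypothetical values

print(bayesian_prediction(p_a, p_z_given_a, p_z_given_b))   # 0.70
print(digitized_prediction(p_a, p_z_given_a, p_z_given_b))  # 0.90: B's world ignored
```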
The abduction phase—deciding among potential hypotheses as the best explanation for the
data—relies on a variety of heuristics. Although many of these heuristics may adaptively help to
circumvent computational limits or even lead to more accurate inferences, they also produce
systematic biases relative to Bayesian norms. Most relevant, people assign a higher probability to a
hypothesis that outperforms its competitors, relative to what is implied by objective probabilities
(Douven & Schupbach, 2015; see also Lipton, 2004). This sort of process could plausibly give rise to
digitization. Moreover, explanation often leads to overgeneralization in the face of exceptions
(Williams et al., 2013), consistent with the idea that abduction tends to underweight or ignore lower-
probability hypotheses. But in our studies, abduction does not seem to be a necessary ingredient for
digitization, since digitization even occurs when hypothesis probabilities P(A) and P(B) are provided
explicitly, avoiding the need for abductive processing (Studies 5 and 8A). The most likely resolution
of this puzzle is that abduction leads us to explicitly assign more extreme probabilities to hypotheses,
relative to Bayesian norms, but not to ignore those less-likely hypotheses altogether.
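One toy way to express this resolution (my formalization, not the authors'): abduction behaves as if it raised the posterior to a power greater than 1 and renormalized, sharpening the explicit probabilities without zeroing out the weaker hypothesis.

```python
# Hypothetical sharpening model (illustrative only, not from the paper):
# abduction makes explicit hypothesis probabilities more extreme than the
# Bayesian posterior, but keeps the weaker hypothesis above zero.

def sharpen(posteriors, gamma=2.0):
    """Raise each posterior to the power gamma (> 1) and renormalize."""
    powered = [p ** gamma for p in posteriors]
    total = sum(powered)
    return [p / total for p in powered]

print(sharpen([0.75, 0.25]))  # -> [0.9, 0.1]: more extreme, not digitized
```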
The simulation phase—imagining the plausibility of the prediction in the possible worlds defined
by each hypothesis—is known to have sharp capacity limits (Hegarty, 2004). Indeed, even within a
simulation of a single causal system, people imagine each step in that system piecemeal. Thus, it seems unlikely that people can simulate multiple possible worlds and hold their outputs in mind at the same time. Consistent with the idea that this is the key processing bottleneck producing digitization, people do consider multiple possibilities when the predictive probabilities P(Z|A) and P(Z|B) are given explicitly, avoiding the need to simulate these outcomes (Studies 8B and 8C).
Yet, this does not seem to be the whole story. The integration phase—putting together multiple
pieces of evidence and weighting each by its diagnosticity—is also subject to biases. In particular,
people tend to over-rely on information about evidence strength (e.g., the proportion of cases consistent with a hypothesis) relative to information about evidence weight (e.g., sample size) (Griffin & Tversky, 1992; Kvam & Pleskac, 2016). Although this bias should not be extreme enough to lead people to
ignore lower-probability hypotheses, it could result in overconfidence—overly extreme probabilities—if people treat predictive probabilities as strength information (how likely the prediction is within each possible world) and hypothesis probabilities as weight information (how much to consider each
possible world). This pattern seems to be consistent with the data. Even when both the hypothesis
and predictive probabilities are given explicitly, so that only integration is required, participants over-rely on the high-probability hypothesis relative to the low-probability one (Study 8C).
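Read this way, integration acts as a partial digitizer: the judged probability lands between the Bayesian answer and the favored world's prediction. A hypothetical one-parameter sketch (the mixing weight lam is invented for illustration):

```python
# Illustrative model of biased integration (my formalization, not the
# authors'): the judged probability mixes the Bayesian norm with the
# prediction from the single favored world. lam = 0 is fully Bayesian;
# lam = 1 is full digitization.

def integrate(p_a, p_z_given_a, p_z_given_b, lam=0.6):
    bayes = p_a * p_z_given_a + (1.0 - p_a) * p_z_given_b   # normative answer
    favored = p_z_given_a if p_a >= 0.5 else p_z_given_b    # likelier world only
    return (1.0 - lam) * bayes + lam * favored

print(integrate(0.75, 0.9, 0.1))  # 0.82: between Bayesian 0.70 and digitized 0.90
```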
Thus, all three processing steps appear to contribute to overly extreme probability judgments,
albeit in different ways. Abduction may result in explicit probabilities that are too extreme, relative to
Bayesian norms. Integration seems to result in under-responsiveness to hypothesis probabilities. And
simulation seems to lead people to ignore lower-probability hypotheses entirely.
If digitization can lead to systematic errors, relative to Bayesian norms, why might the mind use
this principle? Digitization is often necessary to avoid a combinatorial explosion (Bobrow, 2012;
Friedman & Lockwood, 2016). Suppose you are unsure whether the Fed will raise interest rates.
Depending on this decision, Congress may attempt fiscal stimulus; depending on Congress’s decision, the CEO of Citigroup may decrease capital reserves; and depending on the CEO’s decision, SEC regulators may tighten enforcement of certain rules. Integrating across such chains of possibilities becomes daunting even for a computer as the number of branches increases. As recently as the 1990s, chess-playing computers used brute-force methods to search through trees of possible moves, and even the famous Deep Blue, despite its massive processing power, did not consistently defeat the best human players such as Garry Kasparov (Deep Blue conceded 2.5 of the 6 points in their final, 1997 match). The computationally efficient way to approach such a problem is precisely the opposite of brute force: to construct plausible scenarios and ignore the rest. Human chess players had, and probably still have, far better heuristics for pruning this huge space of possibilities. Our participants’ error was to use this strategy even when the normative calculation was straightforward; in other contexts it may be adaptive. Indeed, when the most likely hypothesis has a probability close to 100%, digitization may even be a reasonable approximation to the Bayesian solution.
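For a concrete feel of the explosion, a small sketch (all quantities invented): exact prediction must sum over every one of the 2^n possible worlds, while the digitized shortcut visits just one.

```python
from itertools import product
import random

# Illustrative only: a chain of n binary uncertainties. Exact prediction
# integrates over all 2**n possible worlds; the digitized shortcut just
# assumes each uncertain event goes the more likely way.

random.seed(0)
n = 12
p_event = [random.uniform(0.3, 0.9) for _ in range(n)]  # P(event i occurs)

def p_z_given(world):
    """Hypothetical predictive probability: depends on how many events occurred."""
    return 0.1 + 0.8 * sum(world) / len(world)

# Exact integration: 2**n terms, doubling with every added uncertainty.
p_z = 0.0
for world in product([0, 1], repeat=n):
    p_world = 1.0
    for occurred, p in zip(world, p_event):
        p_world *= p if occurred else (1.0 - p)
    p_z += p_world * p_z_given(world)

# Digitized shortcut: a single world, no integration at all.
likely_world = [1 if p >= 0.5 else 0 for p in p_event]
print(p_z, p_z_given(likely_world))
```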
What, then, should we make of probabilistic theories of cognition (Gershman et al., 2015;
Tenenbaum et al., 2011)? People clearly can represent analog probabilities at some level (“a 70%
chance of rain”), but our results suggest that they often fail to use these probabilities for downstream
predictions, instead digitizing them. Because probabilistic models typically characterize the output of
reasoning processes rather than the underlying mechanisms, they can be of great value in
characterizing the problems that our minds solve. But to the extent that such theories make
mechanistic claims involving the processing of analog probabilities within complex computations—
even at an implicit level—simpler, heuristic mechanisms may better account for human successes,
such as they are, with uncertainty. We look forward to the possibility that computational approaches
to the kinds of tasks we model in this paper can help to shed further light on the underlying cognitive processing.
