Sunday, June 30, 2019

What Suboptimal Choice Tells Us About the Control of Behavior

What Suboptimal Choice Tells Us About the Control of Behavior. Thomas R. Zentall. Comparative Cognition & Behavior Reviews, Volume 14: pp. 1–18.

Abstract: When animals make decisions that are suboptimal, it helps us to identify the processes that have evolved to produce this behavior. In an earlier article, I discussed three examples of suboptimal choice or bias (Zentall, 2016): (a) sunk cost, the tendency to continue on a losing project because of the amount already invested; (b) unskilled gambling, in which the loss is greater than the return; and (c) justification of effort, the bias to prefer conditioned stimuli that in training required more effort to obtain. Here I discuss three additional examples of suboptimal choice that we have studied in animals: (a) when less is better, in which animals prefer one piece of food (one preferred item) over two pieces of food (one preferred item plus one less preferred item); (b) suboptimal choice on the ephemeral choice task, in which animals prefer one piece of food now over two pieces of the same food, one now but the second briefly delayed; and (c) suboptimal choice in the midsession reversal task, errors of anticipation and perseveration. Each of these examples may help to identify the relative limits on behavioral flexibility found when animals are exposed to conditions that may be different from those that they would normally encounter in their natural environment. They also may help us to understand the origins of similar behavior when it occurs in humans.

Keywords: suboptimal choice, less is better, ephemeral reward, midsession reversal

Those of us who study the behavior of animals assume that they have evolved to maximize their success (e.g., at finding food), and much of learning theory (Skinner, 1938; Thorndike, 1911) is based on this premise. Animals select those responses that lead to the increased probability of reinforcement over those that do not. When animals’ behavior is consistent with this theory, it strengthens our belief in the validity of the theory. However, when animals show a preference for alternatives that result in less food over those that result in more food, it is important to try to understand why they do.

Kacelnik (2006) suggested that rationality in decision making can be defined in different ways. When defined by philosophers and psychologists, it has been judged in terms of the reasoning or thought processes that accompany the decisions. When defined by economists, it does not require thought processes but refers to behavior that is internally consistent and is compatible with expected utility maximization. When defined by biologists, it is broader and goes beyond the organism to allow for inclusive fitness (including benefit to one’s kin).

Sometimes, what appears to be an irrational choice may reflect a change in state. An animal’s preference for one kind of food over another may reverse if it has been sated on the preferred food, or an animal that has a choice between eating and being with conspecifics may choose the latter because being close to others may enhance feeding rate or may offer safety from predation (Kacelnik, 2006). Alternatively, the condition that the animal is in may cause it to choose less food over more food. For example, an animal may choose a low probability but possibly larger amount of food over a frequent but smaller amount of food but one that will not allow it to survive through the night (Stephens, 1981; see also Houston, McNamara, & Steer, 2007).

When animals prefer an alternative that provides them with less food over one that provides them with more (i.e., they choose suboptimally), it may cause us to question the processes that underlie that behavior. In an earlier article in this journal (Zentall, 2016), I described a task in which pigeons showed a strong preference for one alternative that on 20% of the trials provided them with a signal for reinforcement and on 80% of the trials provided them with a signal for the absence of reinforcement, over a second alternative that always provided them with a signal for 50% reinforcement (Stagner & Zentall, 2010). With this procedure, not only do pigeons quickly show a preference for the 20% signaled over the 50% unsignaled reinforcement but they show no evidence that they learn to correct that preference with extensive training. Furthermore, that preference is not simply controlled by the uncertainty of the reinforcement associated with the higher probability of reinforcement alternative, because even when the alternatives are between 50% signaled reinforcement and 100% reinforcement, pigeons do not show a preference for the optimal alternative (McDevitt, Dunn, Spetch, & Ludvig, 2016; Smith & Zentall, 2016). In addition, a similar pattern of suboptimal choice can be shown when reinforcement magnitude is manipulated. For example, pigeons prefer a 20% chance of obtaining a signal for 10 pellets of food over a 100% chance of obtaining a signal for three pellets of food (Zentall & Stagner, 2011).
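To make the sense in which these preferences are suboptimal concrete, the expected payoff per choice can be worked out from the percentages quoted above. The following is only a rough sketch, not part of the original article: the function name `expected_reward` and the dictionary layout are illustrative, and it assumes that a signal for reinforcement is always followed by food and that a signal for its absence never is.

```python
# Illustrative sketch (not from the article): expected payoff per choice
# for the three comparisons described above.
# Assumption: a "signal for reinforcement" is always followed by food.

def expected_reward(p_reinforced, amount=1.0):
    """Expected food per choice: probability of reinforcement times amount."""
    return p_reinforced * amount

# Each entry: (payoff of the alternative pigeons preferred or failed to
# avoid, payoff of the optimal alternative).
comparisons = {
    "20% signaled vs. 50% unsignaled": (
        expected_reward(0.20),
        expected_reward(0.50),
    ),
    "50% signaled vs. 100%": (
        expected_reward(0.50),
        expected_reward(1.00),
    ),
    "20% x 10 pellets vs. 100% x 3 pellets": (
        expected_reward(0.20, amount=10),
        expected_reward(1.00, amount=3),
    ),
}

for label, (suboptimal, optimal) in comparisons.items():
    print(f"{label}: suboptimal = {suboptimal:.2f}, optimal = {optimal:.2f}")
```

In every comparison the alternative that the pigeons prefer, or fail to avoid, has the lower expected payoff, which is what makes the choice suboptimal in the sense used here.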

When animals choose suboptimally, it may tell us something about the natural environment in which the animals have evolved (Fortes, Pinto, Machado, & Vasconcelos, 2018). Several mechanisms may be responsible for this suboptimal choice. First, in nature, when an animal approaches a stimulus that signals the presence of food, it is likely that the probability of reinforcement will increase. Not so in this choice task in which choice frequency has no effect on the probability of reinforcement. Second, in nature, when an animal encounters a signal for the absence of food, that signal can generally be ignored, because the animal will simply reject it and look elsewhere for food (Fortes et al., 2018; Vasconcelos, Machado, & Pandeirada, 2018). That is, in nature there is no need to remain in its presence, so it does not acquire inhibitory value, whereas the animal must remain in its presence in the laboratory choice experiment.

Although the predictive value of the conditioned reinforcer that follows choice of each alternative, independent of its probability of occurrence, appears to predict choice (Smith & Zentall, 2016), evidence suggests that there may be a third factor (Case & Zentall, 2018; McDevitt et al., 2016). Case and Zentall (2018) found that when pigeons are given a choice between 50% signaled reinforcement and 100% reinforcement, they initially show indifference between the two alternatives; however, with continued training they show a significant preference for the suboptimal alternative (see also Kendall, 1974). Case and Zentall suggested that the preference for the suboptimal alternative may result from positive contrast between the expected value of reinforcement following choice of the suboptimal alternative and the value of the conditioned reinforcer that follows on half of the trials. Positive contrast would not be expected between choice of the optimal alternative and the conditioned reinforcer that follows, because the expected value of reinforcement is consistent with the value of reinforcement that follows. A similar mechanism was suggested by McDevitt et al. (2016), who proposed that the conditioned reinforcement that followed choice of the suboptimal alternative represented “good news,” whereas the conditioned reinforcement that followed choice of the optimal alternative was not newsworthy. Although identifying the predispositions responsible for suboptimal choice with this procedure will likely require further research, the inability of the pigeons to learn to choose optimally suggests that there are conditions under which pigeons do not appear to have the flexibility to overcome these predispositions.

In the earlier article (Zentall, 2016), I identified two other cases in which pigeons fail to choose optimally. The first was research on the sunk cost effect in which pigeons prefer to complete pecking on one reinforcement schedule over changing to another reinforcement schedule, even though changing to the other schedule would have reduced the time and effort (amount of pecking) to reinforcement. For example, pigeons first learned to peck 30 times for food when the color was green and 10 times for food when the color was red. They then learned that after pecking green a variable number of times, they would be given a choice between completing the pecks to green and switching to peck the red 10 times. Surprisingly, the pigeons preferred to return to pecking green, even when returning to green required as many as 25 more pecks (Pattison, Zentall, & Watanabe, 2012; see also Magalhães & White, 2014; Navarro & Fantino, 2005).

The second additional line of research described in the Zentall (2016) article actually involved a bias rather than a suboptimality. Pigeons were trained to peck a light to receive a choice between two colors. On some trials, a single peck was required and the choice was between, for example, red and yellow and choice of red was reinforced. On other trials, 20 pecks were required and the choice was between, for example, green and blue and choice of green was reinforced. On probe trials, pigeons were given a choice between red and green, the two colors both associated with reinforcement. Surprisingly, the pigeons showed a preference for green, the color that during training they had to work harder to obtain. When a similar effect has been found in humans (e.g., Aronson & Mills, 1959), it has been referred to as the justification of effort effect; however, we prefer to interpret this preference as a contrast effect. That is, the positive contrast between 20 pecks and green was greater than the positive contrast between one peck and red.

In the present article I examine three additional phenomena, each of which demonstrates a behavior that is suboptimal. The first is commonly referred to as the less is better effect; the second is the failure to learn to choose optimally on a task in which choice of one alternative provides two reinforcements, whereas the other provides only one (the ephemeral reward task); and the third is the failure to choose optimally on the midsession reversal task.

The Less Is Better Effect

Economists have traditionally held that when humans are given sufficient information, they generally make rational choices (Persky, 1995). This is the basis of rational choice theory (Becker, 1976). However, Tversky and Kahneman (1974) challenged this notion by showing that humans tend to use various affective heuristics in making decisions and those heuristics can be shown to lead to suboptimal decisions. Such an example is the less is better effect (sometimes referred to as the less is more effect), demonstrated in several experiments by Hsee (1998). In one example, Hsee asked subjects to estimate the value of a set of 24 dishes, all in good condition, or to estimate the value of a set of 40 dishes, but only 31 were in good condition. Surprisingly, the set of 24 dishes was valued higher than the set of 40 dishes. Apparently, the nine dishes of poor quality depreciated the value of the 31 good-quality dishes. The average quality of the set, as a whole, apparently overshadowed the objective judgment of the value of the set. But this effect may be unique to humans, who may be sensitive to the aesthetics of the two sets of dishes.

In another study, subjects were asked to imagine that a friend had given them a $55 wool coat from a store where coats cost between $50 and $500, or alternatively a $45 wool scarf from a store where scarves cost between $5 and $50 (Hsee, 1998). The subjects said that they would be happier with the scarf than with the coat because the purchase of the scarf would reflect greater generosity than the purchase of the coat. The scarf was at the high end of the range, whereas the coat was at the low end of the range. This finding suggests that if gift givers want their gift recipients to perceive them as generous, it would be better for them to give a high-value item from a low-value product category (e.g., a $45 scarf) than a low-value item from a high-value product category (e.g., a $55 coat).

Would animals show the same bias if food of different quality was used rather than dishes or clothing? According to optimal foraging theory (Stephens & Krebs, 1986), other factors being equal (e.g., the possibility of predation), nature should select against any tendency to prefer an alternative that provides less food. Kralik, Xu, Knight, Khan, and Levine (2012) tested this hypothesis. They found that monkeys would readily eat grapes and sliced cucumbers, but when offered a choice between them, they preferred the grapes. When the monkeys were offered a choice between a grape by itself or a grape and a slice of cucumber, however, they generally showed a strong preference for the grape alone.

A similar effect was found by Beran, Ratliff, and Evans (2009) for two of four chimpanzees when given a choice between a slice of banana and a similar slice of banana plus a slice of apple. Similarly, chimpanzees were indifferent between a preferred pellet and a similar pellet plus either a less preferred piece of carrot or a less preferred piece of apple (Sanchez-Amaro, Pereto, & Call, 2016). And when Beran, Evans, and Ratliff (2009) manipulated the quantity rather than the quality of the combined option, four chimpanzees preferred a 20 g slice of banana over the same 20 g slice of banana plus an additional 5 g slice of banana.

Dogs, too, have been found to show a less is better effect (Pattison & Zentall, 2014). Several dogs were found to eat a slice of carrot or a slice of cheese, but when given a choice, they preferred the cheese. However, when given a choice between the cheese and a combination of the cheese and the carrot, these dogs preferred the cheese alone (see Figure 1).

[full paper and figures in the link above]