Wednesday, September 14, 2022

Recognition of Masked Faces in the Era of the Pandemic: No Improvement Despite Extensive Natural Exposure

Recognition of Masked Faces in the Era of the Pandemic: No Improvement Despite Extensive Natural Exposure. Erez Freud et al. Psychological Science, September 12, 2022. https://doi.org/10.1177/09567976221105459

Abstract: Face masks, which became prevalent across the globe during the COVID-19 pandemic, have had a negative impact on face recognition despite the availability of critical information from uncovered face parts, especially the eyes. An outstanding question is whether face-mask effects would be attenuated following extended natural exposure. This question also pertains, more generally, to face-recognition training protocols. We used the Cambridge Face Memory Test in a cross-sectional study (N = 1,732 adults) at six different time points over a 20-month period, alongside a 12-month longitudinal study (N = 208). The results of the experiments revealed persistent deficits in recognition of masked faces and no sign of improvement across time points. Additional experiments verified that the amount of individual experience with masked faces was not correlated with the mask effect. These findings provide compelling evidence that the face-processing system does not easily adapt to visual changes in face stimuli, even following prolonged real-life exposure.

Discussion

Face masks were an important tool in the effort to minimize COVID-19 virus transmission (Cheng et al., 2020). Accordingly, the years 2020 to 2022 provided an unprecedented opportunity to examine the effects of prolonged and frequent exposure to occluded faces on recognition abilities. Here, we have documented persistent quantitative and qualitative alterations in face-processing abilities for masked versus nonmasked faces, with no evidence of improvement in the processing of masked faces over time. Using a combined cross-sectional and longitudinal approach, we found that the CFMT scores for upright faces decreased by approximately 15% when masks were added to the faces. This reduction remained statistically constant across 20 months, a period of extensive exposure to masked faces. This finding suggests that the matured face-processing system did not benefit from the prolonged exposure. Additional experiments and analyses confirmed and extended this conclusion and showed that the consistent decrement in face processing of masked faces was evident even when individual differences in exposure to these faces were considered.
Another key finding is the consistent and robust reduction of the face-inversion effect for masked faces across all time points. In particular, the inversion effect was roughly 43% smaller for masked faces. The inversion effect is suggested to reflect difficulties extracting the configural relationships between face parts (Farah et al., 1995; Freire et al., 2000). Hence, the smaller inversion effect for masked faces may be taken as evidence that holistic processing is largely reduced (although not entirely abolished). This qualitative change in the processing of masked faces was consistent across time points, providing additional evidence for the rigidity of the matured face-processing system.
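
As a rough illustration of how the two percentages above are derived, the sketch below computes a relative mask cost and a relative reduction in the inversion effect. The scores are hypothetical values chosen only to reproduce figures of the reported size (about 15% and about 43%); they are not the study's data, and the function names are made up for this example.

```python
def relative_reduction(baseline: float, altered: float) -> float:
    """Fractional drop from baseline to altered (0.15 means 15% lower)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - altered) / baseline


def inversion_effect(upright: float, inverted: float) -> float:
    """Inversion effect as the upright-minus-inverted score difference."""
    return upright - inverted


# Hypothetical percent-correct CFMT scores (illustration only, not study data).
unmasked_upright, unmasked_inverted = 80.0, 59.0
masked_upright, masked_inverted = 68.0, 56.0

# Mask cost for upright faces, relative to unmasked upright performance.
mask_cost = relative_reduction(unmasked_upright, masked_upright)

# How much smaller the inversion effect is for masked than for unmasked faces.
ie_unmasked = inversion_effect(unmasked_upright, unmasked_inverted)
ie_masked = inversion_effect(masked_upright, masked_inverted)
ie_reduction = relative_reduction(ie_unmasked, ie_masked)

print(f"mask cost: {mask_cost:.0%}, inversion effect reduction: {ie_reduction:.0%}")
```

A smaller inversion effect here means that turning a masked face upside down costs less accuracy than turning an unmasked face upside down, which is the pattern the authors read as reduced holistic processing.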

Why is there no improvement in masked-face recognition?

The consistent effect of masks across time points could reflect the rigidity of the matured face-processing system. In particular, face perception rapidly develops in infancy but is then subject to a prolonged developmental trajectory (Pascalis et al., 2011, 2020). In early childhood, face processing is shaped by experience with other faces (Bate et al., 2020). One of the best examples of this malleability comes from the other-race effect, which is evident early in life (Kelly et al., 2009) but could be reversed or disappear if a child is regularly exposed to other-race faces (De Heering et al., 2010; Sangrigoli et al., 2005). In contrast, in adulthood, face-processing mechanisms are already in place and are less likely to be affected by experience (Pascalis et al., 2020; White, Kemp, Jenkins, Matheson, & Burton, 2014; Yovel et al., 2012). Here, we show that even extensive, naturalistic exposure to masked faces is not sufficient to facilitate the recognition of these faces, even though the eyes region, which is disproportionally critical for face recognition (Butler et al., 2010; Caldara et al., 2005; Royer et al., 2018; Tardif et al., 2019), remains uncovered.
An additional account for the lack of improvement in recognizing masked faces relates to the nature of the interaction. One can argue that mere exposure to masked unfamiliar faces may not suffice to revamp face-processing mechanisms. However, we note that daily encounters with masked people typically include more than just passive viewing. For example, in the grocery store, a person may need to identify their neighbor or their preferred cashier. An office worker needs to recognize peers and customers. Parents who pick up their children from school interact with other parents, children, and teachers. Hence, daily experiences provide a rich arena of exposures and the need to recognize masked faces. Yet our data suggest that such naturalistic exposures and interactions might be insufficient in eliciting adaptation of the face-processing system. A more refined view is that improvement in face-processing abilities in adulthood depends on deliberate, systematic training programs and does not rely on naturalistic exposure. This view is supported by recent studies that show effects of systematic training programs that include individuation tasks (McGugin et al., 2011; Yovel et al., 2012) and ongoing feedback (White, Kemp, Jenkins, & Burton, 2014). Note, however, that even these systematic training programs bring only very moderate improvement in face recognition.
The results could also be attributed to another intriguing possible mechanism; the current situation may be part of a vicious circle, one that reduces the chances to improve. On the one hand, there is massive exposure to masked faces, which, in many cases, require effective recognition. On the other hand, however, people have the chance to meet and to encounter nonmasked people in the privacy of their homes or via electronic media. It is possible, therefore, that such a hybrid state of affairs provides the system with a convenient escape from effectively dealing with masked faces. In other words, the current situation may limit the system’s ability to adapt, even in the face of a clear need to do so. This proposed mechanism could account for the lack of improvement that we report (almost) 2 years into the pandemic. An intriguing question is for how long such lack of improvement could persist. This, of course, depends on the extent and length of the pandemic.
Finally, the observed limited malleability of the matured face-processing system raises important questions about the ability of children to improve in recognizing masked faces. A recent study reported that in school-age children, masks hinder face-processing ability to a similar or even greater extent compared with adults (Stajduhar et al., 2022). Whether children exhibit improved masked-face recognition following prolonged exposure to masked faces in everyday life remains to be determined.

Limitations

The current investigation is timely and unique and benefits from the large sample size and combination of approaches. However, there are still important limitations that should be addressed in future studies. First, although the CFMT is a reliable test that has been used extensively over the past two decades (Bobak et al., 2016; Russell et al., 2009), the faces included in this test are all Caucasian men. Given the gender effect observed in our data as well as by other groups (Bobak et al., 2016), it is important to examine the reported effects using other, more diverse tests (Scherf et al., 2017). Another concern regards the ecological validity of the CFMT. Specifically, external face cues, which are important for real-life face recognition, are not available in this test. This concern might be more detrimental in the case of masked faces. However, it is important to note that previous studies reported correlations between CFMT scores and subjective reports of face-recognition abilities (Shah, Gaule, et al., 2015), between the CFMT and other measurements of face-processing abilities (DeGutis et al., 2013; Russell et al., 2009), and, most importantly, between CFMT scores and naturalistic assessments of face-perception abilities (Balas & Saville, 2017). It is also worth noting that previous studies demonstrated the existence of the mask effect for other tests and image sets, including the GFMT (Carragher & Hancock, 2020; see also the control experiment described above) and the Karolinska Directed Emotional Faces (Marini et al., 2021), in which external face cues are preserved.
The concern regarding ecological validity also applies to the absence of other cues that might facilitate person recognition, such as motion, voice, and body shape. Importantly, however, it is established that faces play a superior role in person recognition even when other cues are available (Hahn et al., 2016). This is demonstrated in cases of prosopagnosia, which is experienced in daily life even when all cues are available.
Another limitation of the current image set (as well as other image sets used in previous studies) is that the masks were added to existing pictures in an artificial manner. This might lead to an omission of face shape cues that are normally available and plausibly critical for recognizing masked faces in naturalistic settings. Although we cannot rule out the detrimental effect of the artificial mask on face perception, a recent study by Marini and colleagues (2021) demonstrated the existence of a mask effect even for transparent masks that reveal important cues from the lower part of the face. Hence, it is unlikely that the mask effect observed here, especially the lack of improvement in face perception for masked faces, is solely due to the nature of the stimuli.

Converging theoretical and empirical evidence points to suicide being a fundamentally aleatory event – that risk of suicide is opaque to useful assessment at the level of the individual

On the Randomness of Suicide: An Evolutionary, Clinical Call to Transcend Suicide Risk Assessment. C. A. Soper, Pablo Malo Ocejo and Matthew M. Large. Chp 9 in Evolutionary Psychiatry: Current Perspectives on Evolution and Mental Health. Cambridge University Press, September 8 2022. https://www.cambridge.org/core/books/abs/evolutionary-psychiatry/on-the-randomness-of-suicide/17353786992D8A3F0A7EEC7850F44F96

Summary: Converging theoretical and empirical evidence points to suicide being a fundamentally aleatory event – that risk of suicide is opaque to useful assessment at the level of the individual. This chapter presents an integrated evolutionary and clinical argument that the time has come to transcend efforts to categorise people’s risk of taking their own lives. A brighter future awaits mental healthcare if the behaviour’s essential non-predictability is understood and accepted. The pain-brain evolutionary theory of suicide predicts inter alia that all intellectually competent humans carry the potential for suicide, and that suicides will occur largely at random. The randomness arises because, over an evolutionary timescale, selection of adaptive defences will have sought out and exploited all operative correlates of suicide and will thus have exhausted those correlates’ predictive power. Completed suicides are therefore statistical residuals – events intrinsically devoid of informational cues by which the organism could have avoided self-destruction. Empirical evidence supports this theoretical expectation. Suicide resists useful prediction at the level of the individual. Regardless of the means by which the assessment is made, people rated ‘high risk’ seldom take their own lives, even over extended periods. Consequently, if a prevention treatment is sufficiently safe and effective to be worth allotting to the ‘high-risk’ subset of a cohort of patients, it will be just as worthwhile for the rest. Prevention measures will offer the greatest prospects for success where the aleatory nature of suicide is accepted, acknowledging that ‘fault’ for rare, near-random, self-induced death resides not within the individual but as a universal human potentiality. A realistic, evolution-informed, clinical approach is proposed that focuses on risk communication in place of risk assessment.
All normally sapient humans carry a vanishingly small daily risk of taking their own lives but are very well adapted to avoiding that outcome. Almost all of us nearly always find other solutions to the stresses of living.
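
The claim that people rated ‘high risk’ seldom go on to the predicted outcome is, at bottom, a base-rate argument. The sketch below works through the arithmetic of positive predictive value for a rare event. Every number in it (base rate, sensitivity, specificity) is a hypothetical placeholder chosen for illustration; none comes from the chapter.

```python
def positive_predictive_value(base_rate: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Share of 'high risk' ratings that are followed by the event (Bayes' rule)."""
    true_positives = base_rate * sensitivity
    false_positives = (1.0 - base_rate) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs, for illustration only (not taken from the chapter):
# an event affecting 1 in 1,000 people over the period, and an assessment
# that flags half of true cases and wrongly flags 10% of everyone else.
ppv = positive_predictive_value(base_rate=0.001,
                                sensitivity=0.5,
                                specificity=0.9)

print(f"Share of 'high risk' ratings followed by the event: {ppv:.2%}")
```

Under these assumed inputs, well under 1 in 100 people in the ‘high-risk’ group experience the event, and half of all events occur in the ‘low-risk’ group. That is the sense in which a treatment worth giving to one group would be about as worthwhile for the other.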


Tuesday, September 13, 2022

Voter bias against women cannot explain female underrepresentation in American politics; if anything, voters prefer women over men

Poutvaara, Panu; Graefe, Andreas (2022) : Do Americans Favor Female or Male Politicians? Evidence from Experimental Elections, Beiträge zur Jahrestagung des Vereins für Socialpolitik 2022: Big Data in Economics, ZBW - Leibniz Information Centre for Economics, Kiel, Hamburg. https://www.econstor.eu/bitstream/10419/264117/1/vfs-2022-pid-70505.pdf

Abstract: Women are severely underrepresented in American politics, especially among Republicans. This under-representation can arise from women being less willing to run for office, from voter bias against women, or from political structures that make it more difficult for women to compete. Here we show to what extent support for female candidates varies by voters’ party affiliation and gender. We carried out hypothetical elections in which participants made vote choices solely based on politicians’ faces. When deciding between candidates of different genders, Democrats, and particularly Democratic women, preferred female candidates, while Republicans chose female and male candidates equally often. These patterns remained when controlling for respondents’ education, age, and political knowledge and for candidates’ age, attractiveness, and perceived conservativeness. Our results suggest that voter bias against women cannot explain female under-representation. On the contrary, American voters appear ready to further narrow the gender gap in politics.

Keywords: Gender; Elections; Gender discrimination; Political candidates
JEL: D72; J16

5 Conclusion

Major gender gaps have opened in American politics in recent decades. Women are more likely than men to support Democrats (Gillion et al., 2020), Democratic voters are more likely than Republicans to support female candidates (Schwarz & Coppock, 2021), and the female share of congressional Democrats is almost three times that of congressional Republicans (Fig. 1B). We carried out hypothetical elections in 2016 and 2020 to disentangle how voter gender and partisanship interact in support for female candidates. Our results show that Democrats generally favored female candidates, and that preference for female candidates was particularly strong among Democratic women. In our 2020 survey, Democratic women chose the female candidate three times as often as the male candidate. Republican respondents, instead, chose female and male candidates about equally often. Our findings suggest that voter bias against women cannot explain female under-representation in American politics, even among Republicans. If anything, voters, on average, prefer women over men.

Our approach to studying gender discrimination in voting complements vignette and conjoint survey experiments, which have become an established practice in political science research (Hainmueller et al., 2015; Hainmueller et al., 2014). In these studies, respondents state their preferences based on short, standardized descriptions of hypothetical candidates. Vignette and conjoint survey experiments allow studying simultaneously the effects of different cues, like gender, age, and reported experience. However, this comes at the cost that researchers define the characteristics that are presented to respondents, and how these are presented. Our approach of asking respondents to make vote choices based on candidate photographs does not require researchers to specify what textual cues are provided to respondents and in which order. Instead, we collected vote choices for hypothetical elections among all 736 Members of the European Parliament. One advantage of using MEPs was that they are real and elected politicians. Hence, the photographs likely incorporate cues that are relevant in politics, which may not be the case when using stock photographs. Another advantage of using MEPs was that American respondents are unlikely to recognize the candidates, which could have introduced bias. Finally, previous research has shown that evaluations of politicians’ photographs help to predict election outcomes around the world, providing external validity for using photographs (Antonakis & Dalgas, 2009; Ballew & Todorov, 2007; Berggren et al., 2010; Lawson et al., 2010; Todorov et al., 2005).

A major concern in all surveys is that subjects might change their behavior due to cues about what constitutes appropriate behavior (Zizzo, 2010). In our setting, the concern is that respondents would find supporting female candidates in hypothetical elections the appropriate choice, even if they would not vote for the female candidate in a real election. Our study design alleviates these concerns by randomizing gender combinations in hypothetical elections. We also did not refer to gender – but only to voting under very little information – in our task description. Furthermore, recent research has found that experimenter demand effects are small in online surveys even when respondents are provided a hint on the hypothesis that researchers are testing (de Quidt et al., 2018; Mummolo & Peterson, 2019). Comparing conjoint and vignette experiments with real referendums in Switzerland also suggests that estimates from survey experiments perform remarkably well in predicting actual voting outcomes (Hainmueller et al., 2014).

Our results emphasize the critical role of supply side factors as remaining barriers to closing the gender gap in political representation, such as women’s reluctance to enter politics and discrimination by party elites and donors, as well as the weight of historical female under-representation through incumbency advantage. Given that voters with prior exposure to female leaders are more likely to vote for women (Baskaran & Hessami, 2018; Beaman et al., 2009; Bhavnani, 2009), recent increases in the share of elected female politicians, and the election of Kamala Harris as the first female Vice President of the United States, could foreshadow a narrowing gender gap in years to come.


Moralization of rationality can actually stimulate the spread of news hostile to political opponents, because status-seeking individuals moralize rationality as a form of grandstanding and use it to spread hostile information in order to sound relevant

Marie, Antoine, and Michael Bang Petersen. 2022. “Moralization of Rationality Can Stimulate, but Intellectual Humility Inhibits, Sharing of Hostile Political Rumors.” OSF Preprints. March 4. doi:10.31219/osf.io/k7u68

Abstract: Many assume that if citizens became more inclined to moralize the values of evidence-based and logical thinking, political hostility and conspiracy theories would be less widespread. Across two large surveys (N = 3675) run in the U.S. in 2021 (one exploratory and one pre-registered), we provide the first demonstration that moralization of rationality can actually stimulate the spread of news hostile to political opponents. We provide further evidence that this counter-intuitive finding reflects that status-seeking individuals moralize rationality as a form of grandstanding and use it to spread hostile information, in order to sound relevant. In contrast to such moral grandstanding with respect to rationality, our studies find robust evidence that intellectual humility (i.e., the awareness that intuitions are fallible, and that trusting others is often desirable) may protect people from both sharing and believing hostile news. Those associations generalized to all hostile news, independently of whether they are “fake” or anchored in real events.


Effect of urbanicity (metro v nonmetro) on life satisfaction, or Subjective WellBeing: The negative effect of metro vs nonmetro is equivalent to the effect of one’s health deteriorating about a third from "fair" to "poor"

Unhappy Metros: Panel Evidence. Adam Okulicz-Kozaryn. Applied Research in Quality of Life, Sep 13 2022. https://rd.springer.com/article/10.1007/s11482-022-10102-7

Abstract: We study the effect of urbanicity (metro v nonmetro) on life satisfaction, or Subjective WellBeing (SWB). The literature agrees that residents of metropolitan areas tend to be less satisfied with their lives than residents of smaller settlements in the developed world. But the existing evidence is cross-sectional only. This is the first study using a longitudinal dataset to test the “unhappy metro” hypothesis. Using the 2009–2019 US Panel Study of Income Dynamics (PSID), we find support for the cross-sectional findings: metros are less happy than nonmetros. The effect size is practically significant: the negative effect of metro v nonmetro is equivalent to the effect of one’s health deteriorating about a third from “fair” to “poor.” Given the extremely large scale of urbanization, a projected 6 billion people from 1950 to 2050, the combined effect of urbanicity on human wellbeing is large.

Notes
1    Interestingly, neuroscience is becoming interested in urbanism (Adli et al., 2017; Pykett et al., 2020), and initial empirical results indicate a negative effect of urbanism on the human brain (Lederbogen et al., 2011).

2    Yet, on the other hand, in a city there can be community, a neighborhood village, that at least in some ways can simulate a more natural habitat for a human (Fischer 1995, 1975; Jacobs [1961] 1993).

3    Whether utility is happiness is debated, and the question is beyond the scope of this study; for discussion, see Van Der Deijl (2018), Welsh (2016), Hirschauer et al. (2015), Kenny (2011), Ng (2011), Clark et al. (2008), Frey et al. (2008), Becker and Rayo (2008), Kahneman and Krueger (2006), Kimball and Willis (2006), Kahneman and Thaler (2006), Stutzer et al. (2004), Frey and Stutzer (2002), Kahneman (2000), Frey and Stutzer (2000), Kahneman et al. (1997), Ng (1997), Kahneman and Thaler (1991), Scitovsky (1976).

4    Burger et al. (2020) also use faulty Gallup data, as elaborated in Okulicz-Kozaryn and Valente (2021). In general, one should avoid Gallup happiness data: Gallup charges $30,000 for access (per year), clearly “happiness industry,” not happiness research (Davies 2015).

Twitter use is related to decreased well-being, increased polarization, and increased sense of belonging with effect sizes with practical significance

de Mello, Victoria O., Felix Cheung, and Michael Inzlicht. 2022. “Twitter Use in the Everyday Life: Exploring How Twitter Use Predicts Well-being, Polarization, and Sense of Belonging.” PsyArXiv. September 12. doi:10.31234/osf.io/4x5em

Abstract: Twitter has the potential to influence public decision-making, as it is the platform used by elites in journalism, entertainment, and politics. How are users affected by Twitter? How are different effects moderated by different characteristics of the user (such as personality) and the use (such as purpose of usage)? We conducted an experience sampling study to address these questions. We found that Twitter use is related to decreased well-being, increased polarization, and increased sense of belonging with effect sizes with practical significance. All effects had considerable heterogeneity. We did not find any evidence for interaction effects with personality, age, or gender. We found that specific usage purposes are linked to different user outcomes. Finally, we found that most of the variance in the effects was driven by within-subjects effects, suggesting that these effects are not caused by third variables.

Monday, September 12, 2022

There were no differences in political orientation between incels and non-incels; approximately 38% reported a right-leaning political affiliation and 44% a left-leaning affiliation

Levels of Well-Being Among Men Who Are Incel (Involuntarily Celibate). William Costello, Vania Rolon, Andrew G. Thomas & David Schmitt. Evolutionary Psychological Science, Sep 12 2022. https://rd.springer.com/article/10.1007/s40806-022-00336-x

Abstract: Incels (involuntary celibates) are a subculture community of men who build their identity around their perceived inability to form sexual or romantic relationships. To address the dearth of primary data collected from incels, this study compared a sample (n = 151) of self-identified male incels with similarly aged non-incel males (n = 378) across a range of measures related to mental well-being. We also examined the role of sociosexuality and tendency for interpersonal victimhood as potential moderators of incel status and its links with mental health. Compared to non-incels, incels were found to have a greater tendency for interpersonal victimhood, higher levels of depression, anxiety and loneliness, and lower levels of life satisfaction. As predicted, incels also scored higher on levels of sociosexual desire, but this did not appear to moderate the relationship between incel status and mental well-being. Tendency for interpersonal victimhood only moderated the relationship between incel self-identification and loneliness, yet not in the predicted manner. These novel findings are some of the earliest data based on primary responses from self-identified incels and suggest that incels represent a newly identified “at-risk” group to target for mental health interventions, possibly informed by evolutionary psychology. Potential applications of the findings for mental health professionals as well as directions for future research are discussed.


Garett Jones argues that full assimilation in a generation or two is a myth, countering a consensus that a nation's economic and political institutions won't be changed by immigration

The Culture Transplant: How Migrants Make the Economies They Move To a Lot Like the Ones They Left. Garett Jones. 2022. https://www.amazon.com/Culture-Transplant-Migrants-Make-Economies/dp/1503632946

Summary:

Over the last two decades, as economists began using big datasets and modern computing power to reveal the sources of national prosperity, their statistical results kept pointing toward the power of culture to drive the wealth of nations. In The Culture Transplant, Garett Jones documents the cultural foundations of cross-country income differences, showing that immigrants import cultural attitudes from their homelands―toward saving, toward trust, and toward the role of government―that persist for decades, and likely for centuries, in their new national homes. Full assimilation in a generation or two, Jones reports, is a myth. And the cultural traits migrants bring to their new homes have enduring effects upon a nation's economic potential.

Built upon mainstream, well-reviewed academic research that hasn't pierced the public consciousness, this book offers a compelling refutation of an unspoken consensus that a nation's economic and political institutions won't be changed by immigration. Jones refutes the common view that we can discuss migration policy without considering whether migration can, over a few generations, substantially transform the economic and political institutions of a nation. And since most of the world's technological innovations come from just a handful of nations, Jones concludes, the entire world has a stake in whether migration policy will help or hurt the quality of government and thus the quality of scientific breakthroughs in those rare innovation powerhouses.



Rolf Degen summarizing... Neuroscience's cherished idea that the dorsolateral prefrontal cortex is crucially involved in the exertion of self-control flunks the replication test

Can we have a second helping? A preregistered direct replication study on the neurobiological mechanisms underlying self-control. Christin Scholz, Hang-Yee Chan, Russell A. Poldrack, Denise T. D. de Ridder, Ale Smidts, Laura Nynke van der Laan. Human Brain Mapping, September 9 2022. https://doi.org/10.1002/hbm.26065

Abstract: Self-control is of vital importance for human wellbeing. Hare et al. (2009) were among the first to provide empirical evidence on the neural correlates of self-control. This seminal study profoundly impacted theory and empirical work across multiple fields. To solidify the empirical evidence supporting self-control theory, we conducted a preregistered replication of this work. Further, we tested the robustness of the findings across analytic strategies. Participants underwent functional magnetic resonance imaging while rating 50 food items on healthiness and tastiness and making choices about food consumption. We closely replicated the original analysis pipeline and supplemented it with additional exploratory analyses to follow up on unexpected findings and to test the sensitivity of results to key analytical choices. Our replication data provide support for the notion that decisions are associated with a value signal in ventromedial prefrontal cortex (vmPFC), which integrates relevant choice attributes to inform a final decision. We found that vmPFC activity was correlated with goal values regardless of the amount of self-control and it correlated with both taste and health in self-controllers but only taste in non-self-controllers. We did not find strong support for the hypothesized role of left dorsolateral prefrontal cortex (dlPFC) in self-control. The absence of statistically significant group differences in dlPFC activity during successful self-control in our sample contrasts with the notion that dlPFC involvement is required in order to effectively integrate longer-term goals into subjective value judgments. Exploratory analyses highlight the sensitivity of results (in terms of effect size) to the analytical strategy, for instance, concerning the approach to region-of-interest analysis.

4 DISCUSSION

Hare et al. (2009) were among the first to provide empirical evidence on the neural correlates of self-control. Since then, this seminal study has had a profound impact on theory and empirical work across multiple fields, but it has never been directly replicated. We performed a preregistered, direct replication of this experiment with two goals: (1) to further strengthen the evidence base for self-control theory and research, and (2) to test the robustness of the original results across analytical choices. The results of the four key hypothesis tests are summarized in Table 2.

TABLE 2. Hypothesis test overview
Hypothesis (quoted from Hare et al., 2009, p. 646) | Replication findings
1. [Activity] in vmPFC should be correlated with participants' goal values regardless of whether or not they exercise self-control | Supported
2. [A]ctivity in the vmPFC should reflect the health ratings in the SC group but not in the NSC group. | Supported, with reservations
3. [T]he dlPFC should be more active during successful than failed self-control trials. | Not supported
4. dlPFC and vmPFC should exhibit functional connectivity during self-control trials. | Mixed evidence
Abbreviations: dlPFC, dorsolateral prefrontal cortex; NSC, non-self-controllers; SC, self-controllers; vmPFC, ventromedial prefrontal cortex.

Our data provide further support for the now widely accepted notion that decisions are associated with a value signal in vmPFC, which integrates relevant choice attributes to inform a final decision (Hypotheses 1 and 2; Table 2). Specifically, like Hare et al. (2009), we found positive correlations between participants' goal values (choices for food items) and activity within vmPFC, regardless of whether participants exercised self-control. We were also able to replicate findings which were reported in the original study in support of the idea that vmPFC prioritizes choice attributes that are consistent with each individual's subjective values. Specifically, as in the original study, activity in vmPFC was associated with the perceived healthiness of food items in participants who were relatively more successful at exercising self-control in the experimental task but not in participants who were relatively less successful. However, we did not find evidence of significant differences between the two groups. Overall, these results are in line with a broader set of literature in neuroeconomics, which has described the role of vmPFC in valuation across diverse types of stimuli (e.g., money, consumer goods, etc.; for a review, see Bartra et al., 2013). The present study is the first to provide a direct replication of this effect in the context of food-related decision-making. Thus, this replication study increases the confidence in choice models of self-control which describe self-control as a value-based choice (Berkman et al., 2017).

In addition to the replication of the originally reported analyses, we added several analysis branches to further test the robustness of these results. First, in a follow-up analysis to the whole-brain search for brain regions associated with goal value (Figure 4), Hare et al. (2009) highlight the fact that individual scale points (−2 to 2) of goal value are neatly distinguished in a step-wise pattern in their vmPFC ROI, suggesting that the ROI can be used to precisely distinguish and predict choices. However, the original analysis approach was optimized to demonstrate this effect and requires individual-level choice data to identify individual peak voxels within a larger vmPFC ROI. In addition, this analysis supports only the limited conclusion that, on average, most study participants show this step-wise encoding of goal value in at least one voxel within a larger vmPFC area. We added an alternative analysis approach by averaging signal extracted from all voxels within the vmPFC ROI in which activity was associated with goal value in our replication sample. We show that the step-wise encoding of choice behavior is largely preserved in this more general analysis, but that the effect size is substantially smaller. Similarly, when examining relationships between health and taste ratings and average signal within vmPFC, we do not find significant encoding of health ratings in the SC group despite the relatively large size of this replication sample. In other words, future studies that reuse these vmPFC ROIs as indicators of goal value, without the luxury of an individual-level localizer task that allows them to identify peak voxels per person, likely require a much larger sample to be appropriately powered than implied by the original publication.

Further, next to vmPFC and in contrast to the original study, we identified positive associations between goal value and activity in clusters within the striatum at a relatively lenient statistical threshold (p < .001, uncorrected) used in the original study. This discovery is likely a function of the increased power in the larger replication sample and largely in line with the neuroeconomics literature on subjective valuation which regularly identifies clusters in both vmPFC and striatum (Bartra et al., 2013). Following up on this finding, we found some evidence of differentiation between individual levels of goal value, even within our caudate ROI when applying the optimized analysis procedure reported in the original study. This adds to the findings in prior work suggesting that vmPFC is not the exclusive locus of goal value representation in the human brain.

We did not find strong evidence in support of the second set of hypotheses (Hypotheses 3 and 4, Table 2) proposed by Hare et al. (2009), which highlight the role of left dlPFC in self-control. First, we examined average activity levels in left dlPFC. Even though there were clear (and replicated) behavioral differences between participants who were relatively more and those who were relatively less successful at exercising self-control in the scanner task, we did not find the hypothesized, statistically significant group differences in dlPFC activity during successful self-control trials in a whole-brain analysis. Instead, we observed relative deactivation across multiple brain regions in NSC relative to SC, including, but not limited to, areas that are involved in processing of subjective values such as vmPFC. One possible alternative hypothesis supported by our data is thus that SC do not rely on more intensive executive processing, indicated by higher dlPFC activity, to downregulate subjective value in self-control situations; instead, they simply perceive less intensive subjective value for “tempting” food items to begin with. Another alternative explanation is that this null finding is due to power limitations in our data, given that only 15 participants (compared to 19 in the original sample) qualified as SC. In other words, there is a possibility that positive activations in dlPFC during self-control are simply more subtle than the resulting deactivation in value-related areas. Although we cannot conclusively disentangle these contradictory ideas, note that we exclusively found negative (although nonsignificant) coefficients within dlPFC in this sample.

Next, we followed procedures reported by Hare et al. (2009) to examine the role of dlPFC in self-control in terms of its functional connectivity with brain activity in vmPFC. Since we were unable to identify a functionally defined dlPFC cluster in which average activity was involved in self-control in the replication sample, we relied on a meta-analytically defined map from www.neurosynth.org (Yarkoni et al., 2011) associated with the term “self-control” and intersected it with an anatomical, left dlPFC mask. Our analyses which fully replicated the original work by focusing exclusively on processes in participants who were relatively more successful SC during the scanner task did not replicate the original findings which suggested a negative indirect relationship between dlPFC and vmPFC activity through IFG/BA46 during self-control. We followed up on this null-result by rerunning the PPI on the full sample of participants who exercised any self-control in the scanner task (N = 59) to address concerns about statistical power. This path was chosen given the absence of strong theoretical arguments that the mechanisms that drive successful self-control differ qualitatively (rather than just in intensity) between people who are successful more often and those who are successful relatively less often. Indeed, in this larger sample, we do find some evidence of replication. Stronger still, we found evidence of direct, negative correlations between activity within our meta-analytic left dlPFC seed and an area within vmPFC, which was hypothesized, but not found by Hare et al. (2009). It is important to note, however, that we simultaneously found evidence for unexpected positive associations between activity in the left dlPFC ROI and another, more dorsal MPFC cluster. Of note here is that the whole-brain table for this analysis in the original publication revealed a similar positive association with an MPFC cluster in almost the exact same location (see Figure 11). 
While there was (minimal) overlap between the unexpected MPFC cluster that showed positive functional connectivity with left dlPFC and the vmPFC ROI that was associated with goal value in our sample, we did not find such overlap for the vmPFC cluster that showed the hypothesized negative association with dlPFC. In other words, the first PPI, at best, provides mixed evidence regarding the nature of the relationship between dlPFC and vmPFC activity during self-control. Hare et al. (2009) proceeded to follow up on the lack of a negative direct association between dlPFC and vmPFC in their first PPI by identifying a cluster in BA46 that was negatively associated with dlPFC as the seed region for a second PPI. Following this analysis approach, we were able to replicate the original findings, identifying a cluster in vmPFC that was positively associated with the BA46 seed identified in the first PPI based on the full replication sample (N = 59) and thus indirectly negatively associated with the meta-analytic dlPFC ROI. In sum, our replication data provide mixed evidence with regard to Hypothesis 4, which concerns a negative relationship between dlPFC and vmPFC activity during self-control.

These mixed results highlight the need for additional work to fully understand the role of the dlPFC in food-related decision-making and in theories of self-control more generally. Overall, our findings are most in line with a conceptualization of self-control as a simple form of value-based decision-making in which different choice attributes (here health and taste considerations) are encoded and integrated in vmPFC according to subjective values of the decision-maker (Berkman et al., 2017). This contrasts with the model that the findings of Hare gave rise to, wherein longer-term goals (here health considerations) required dlPFC involvement in order to be effectively integrated into subjective value judgments (Hare et al., 2009).

A frequently voiced explanation for failed replications is that the (cultural) context differed between the original and replication study (Zwaan et al., 2018). In our case, the original study was performed in the United States before 2009 and the replication in the Netherlands, approximately 10 years later. Thus far, we are not aware of any strong theoretical or empirical claims that the brains or fundamental psychological processes surrounding self-control of US subjects are different from those of Dutch study participants or that the basic neural processes of valuation and self-control have changed over the past decade. However, what could differ between US and Dutch individuals and what could have changed over the past decade is the role of food and dieting in society, and more specifically, to what extent food choices can generate a self-control conflict and how people cope with that. This may—in theory—influence the way in which people respond to the task and stimuli. Naturally, for a self-control dilemma to occur one should have the goal to diet or eat healthy. It could be argued that stronger goal commitment may strengthen attempts of overruling impulses and therefore amplify control-related responses. Observational studies showed that the prevalence of dieting is higher in Europe than in the United States (Santos et al., 2017) and a large proportion of the Dutch population self-reports to diet or actively restrain their food intake (de Ridder et al., 2014). This would speak against this being an explanation for the null finding. It should however be noted that self-reports of dieting and dietary restraint have been shown to be unrelated or weakly related to actual intake (de Ridder et al., 2014; Stice et al., 2004) which casts doubt on this measure being a reliable proxy of goal strength. We cannot rule out but we also cannot support that goal commitment was stronger for the successful SC in the original study compared to the current replication study.

Another important conclusion from this project is that analytical flexibility can influence fMRI results. Specifically, for H1 and H2 we presented two sets of results produced using two different analysis strategies. While the overall patterns of results remained similar, increasing confidence in the directionality of effects, effect sizes differed significantly. This has important implications for follow-up research which may rely on existing work for power calculations. Previous work has shown that not only analytical flexibility but also different preprocessing approaches to the fMRI data (e.g., different software packages and varying parameters) may affect task-based fMRI results (Bowring et al., 2022; Mikl et al., 2008; Triana et al., 2020). In this replication study we employed a state-of-the-art, standardized, and optimized preprocessing pipeline provided by fMRIprep, which was not available to the authors of the original study (Esteban, Markiewicz, et al., 2018). As much as possible, we chose parameters similar to those used in the original study (e.g., the same smoothing kernel). Though submitting the data through different preprocessing pipelines was outside of the scope of the current study, we acknowledge that doing so could potentially further inform the field about the (in)variability of individual results to specific choices made by the researchers. Unpreprocessed data for this project is available on OpenNeuro and would support such an investigation for those interested.

4.1 Impact on theory

Our findings are relevant for future theorizing on self-control. Specifically, this replication data set supports the conceptualization of self-control as either a very simple form of value-based decision-making (Berkman et al., 2017) or as automatic “effortless” self-control (Gillebaart & de Ridder, 2015) rather than a dual-system which involves conscious effortful control.

In psychology, self-control has traditionally been explained with dual-system theories (e.g., Hofmann et al., 2008; Metcalfe & Mischel, 1999). These theories are characterized by the notion of two (competing) systems for processing information, namely a “hot”/automatic/impulsive system and a “cold”/rational/reflective system. According to these dual-system models, self-control is successful when the impulses arising from the “hot” system are overcome and, consequently, behavior is in line with long-term goals. In this traditional approach, the dilemma first must be identified and, subsequently, effortful and conscious inhibition is required to overcome it (Fujita, 2011). A neurobiological parallel to these dual-system models has been proposed in which self-control involves a balance between brain regions representing the reward, salience and emotional value of a stimulus and prefrontal regions associated with (effortful) inhibition and cognitive control (Heatherton & Wagner, 2011). In this traditional perspective, effortful and conscious impulse inhibition is a necessary or defining feature of (successful) self-control.

A major criticism of this traditional perspective is that successful self-control does not always require effortful inhibition or conscious control. It has been proposed that there are many different routes to self-control, only some of which involve effortful inhibition (Fujita, 2011). Research has indicated that people can automate goal-striving behaviors in response to contextual cues (Bargh et al., 2001; Chartrand & Bargh, 1996). For instance, providing cues related to the long-term goal (e.g., dieting cues) promotes goal-congruent choices through goal priming (Fishbach et al., 2003; Papies, 2016; Van der Laan et al., 2017), which is thought to occur without requiring conscious deliberation or effort. Further, by systematically repeating (healthy) behaviors, (healthy) habits can be created. It has been shown that successful SC do not necessarily exert more effort; they perform healthy behaviors automatically because of healthy habits (Galla & Duckworth, 2015; Gillebaart, 2018).

This has led to alternative conceptualizations of self-control which do not include or at least attenuate the role of effortful inhibition. As mentioned, successful self-control has recently been conceptualized as being at least partly an automatic process in which people's responses to environmental cues are routinized (or automatically triggered) in a direction that is in line with their long-term goals (Fujita, 2011; Gillebaart, 2018). A second theory, which has recently gained more traction, is to consider self-control as a simple value-based choice (Berkman et al., 2017). Value-based decision-making involves choosing an option from a set based on its relative subjective value. This process involves calculating a value for each option by evaluating various attributes, both gains (e.g., improved health) and costs (e.g., less food enjoyment); assigning weights to these attributes; and enacting the most valued option. It should be noted that this is a dynamic process. That is, the weight of each attribute is sensitive to attentional shifts (e.g., being explicitly guided toward certain attributes like health), contextual effects, and framing of the choice set. Within this conceptualization of self-control, there is nothing special about long-term goals: attributes related to short- and long-term goals are treated similarly in this equation, though the relative weights may differ based on the aforementioned factors. This discussion in psychology intersects with the ongoing debate in decision neuroscience and temporal discounting, where Kable and Glimcher (2007) suggested there is one common valuation in vmPFC while McClure et al. (2004) suggested that separate neural systems encode value for immediate versus longer-term attributes.
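The value-based account described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model used in the study or by Berkman et al. (2017): the option names, the rating scale, the weight values, and the helper names (FoodOption, subjective_value, choose) are all hypothetical, and a plain weighted sum is assumed as the integration rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FoodOption:
    """A hypothetical choice option with two attribute ratings."""
    name: str
    health: float  # long-term attribute rating (illustrative scale)
    taste: float   # short-term attribute rating (illustrative scale)


def subjective_value(option, weights):
    """Assumed integration rule: a weighted sum of attribute ratings."""
    return weights["health"] * option.health + weights["taste"] * option.taste


def choose(options, weights):
    """Enact the option with the highest subjective value."""
    return max(options, key=lambda o: subjective_value(o, weights))


# Hypothetical options and ratings, chosen only for illustration.
options = [
    FoodOption("apple", health=2.0, taste=0.5),
    FoodOption("candy bar", health=-2.0, taste=2.0),
]

# Same options, different attribute weights (e.g., before and after a health cue).
taste_focused = {"health": 0.2, "taste": 1.0}
health_focused = {"health": 1.0, "taste": 0.5}

print(choose(options, taste_focused).name)   # candy bar
print(choose(options, health_focused).name)  # apple
```

In this sketch the health attribute is handled exactly like the taste attribute. Only the weights change between the two calls, which mirrors the claim that long-term attributes are not "special" in this account and that shifts in attention or framing act through the weights.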

The study of Hare conceptualizes self-control as a value-based decision (H1, H2) but, in line with traditional dual-system models, it still posits that there are dual motives and that the future part is “special”: integrating longer-term considerations into the value system, that is, changing the weight of long-term attributes, requires involvement from control-related areas (i.e., the dlPFC; H3, H4). Their hypothesis about the role of the dlPFC had its basis in the role of dlPFC in cognitive control and emotion regulation. The authors speculated that vmPFC originally evolved to predict the short-term value of stimuli and that humans developed the ability to incorporate long-term considerations into values by giving structures such as the dlPFC the ability to modulate this value.

Our mixed findings regarding dlPFC involvement highlight the need for more research to understand the role of dlPFC in assigning weight to these longer-term consequences. The replication results rather point to the conceptualization of self-control as either automatic and “effortless” or as a (simple) form of value-based decision-making. At a minimum, our results support the idea that it is not the dlPFC that is responsible for increasing the weight of the longer-term attributes in the choice. In support of the latter: when comparing successful to unsuccessful trials that required self-control in all participants, we observed a deactivation of vmPFC, which suggests that successful self-control in this sample may be driven by a weaker subjective value for a given food item rather than by more intensive control driven by dlPFC. The finding that, in successful SC, vmPFC reflects health ratings even though dlPFC is not active suggests that dlPFC activation is not needed to incorporate health into the vmPFC value signal. Thus, in line with the proposition of self-control as a simple form of value-based decision-making (Berkman et al., 2017), decisions may just be the result of multiple single value calculations.

Ten years after a decisive court ruling, we are not able to identify economically or statistically significant effects of corporate political spending on state tax policy, including tax rates, discretionary tax breaks, and tax revenues

Corporate Political Spending and State Tax Policy: Evidence from Citizens United. Cailin R. Slattery, Alisa Tazhitdinova & Sarah Robinson. NBER Working Paper 30352. August 2022. DOI 10.3386/w30352

Abstract: To what extent is U.S. state tax policy affected by corporate political contributions? The 2010 Supreme Court Citizens United v. Federal Election Commission ruling provides an exogenous shock to corporate campaign spending, allowing corporations to spend on elections in 23 states which previously had spending bans. Ten years after the ruling and for a wide range of outcomes, we are not able to identify economically or statistically significant effects of corporate independent expenditures on state tax policy, including tax rates, discretionary tax breaks, and tax revenues.


Is Adolescent Bullying an Evolutionary Adaptation That Confers Fitness Benefits?

Is Adolescent Bullying an Evolutionary Adaptation? A 10-Year Review. Anthony A. Volk, Andrew V. Dane & Elizabeth Al-Jbouri. Educational Psychology Review, Sep 6 2022. https://link.springer.com/article/10.1007/s10648-022-09703-3

Abstract: Bullying is a serious behavior that negatively impacts the lives of tens of millions of adolescents across the world every year. The ubiquity of bullying, and its stubborn resistance toward intervention effects, led us to propose in 2012 that adolescent bullying might be an evolutionary adaptation. In the intervening years, a substantial amount of research has arisen to address this question. Therefore, the goal of this review is to consider whether evidence continues to support an evolutionary perspective that bullying is an adaptation that remains adaptive for some individuals in favorable contexts. In addition, we consider new ideas related to this hypothesis, explore how an evolutionary theory of bullying intersects with other influential perspectives, including ecological and social learning theories, and discuss applied implications for interventions. Our review of the evidence published since our 2012 paper provides very consistent and strong support for the hypothesis that adolescent bullying is, at least in part, an evolutionary adaptation that is currently adaptive regarding at least five evolutionarily relevant functions (the Five “Rs”): Reputation, Resources, deteRrence, Recreation, and Reproduction. We note that bullying is a facultative adaptation that is conditionally adaptive, subject to cost–benefit analyses. Finally, we discuss how an evolutionary theory of bullying frequently complements alternative theories of adolescent bullying rather than conflicting or competing with them. An interdisciplinary approach to bullying that includes evolutionary theory is thus likely to afford stronger options for both research and prevention efforts.


“Consumed by Creed”: Obsessive-compulsive Symptoms Underpin Ideological Obsession and Support for Political Violence

Adam-Troian, Jais, and Jocelyn Belanger. 2022. ““Consumed by Creed”: Obsessive-compulsive Symptoms Underpin Ideological Obsession and Support for Political Violence” PsyArXiv. September 4. doi:10.31234/osf.io/tcrd9

Abstract: Radicalization is a process by which individuals are introduced to an ideological belief system that encourages political, religious, or social change through the use of violence. Here, we formulate an obsessive-compulsive disorder (OCD) model of radicalization that links Obsessive Passion (one of the best predictors of radical intentions) to a larger body of clinical research. The model’s central tenet is that OCD tendencies shape radical intentions via their influence on Obsessive Passion. Across four ideological samples in the United States (Environmental activists, Republicans, Democrats, and Muslims, N = 1,114), we found direct effects between OCD symptoms and radical intentions, as well as indirect effects of OCD on radical intentions via Obsessive Passion. Even after controlling for potential clinical confounds (e.g., adverse childhood experiences, anxiety, depression, substance abuse), these effects remained robust, implying that OCD plays a significant role in the formation of violent ideological intentions and opening up new avenues for the treatment and prevention of violent extremism. We discuss the implications of conceptualizing radicalization as an OCD-like disorder with compulsive violent tendencies and ideology-related concerns.


Sunday, September 11, 2022

We find a positive effect of political preferences heterogamy on union dissolution; in addition, diverging opinions on the Brexit referendum is associated to higher chances of partnership break-up

Arpino, Bruno, and Alessandro Di Nallo. 2022. “Sleeping with the Enemy. Partners’ Political Attitudes and Risk of Separation.” SocArXiv. September 9. doi:10.31235/osf.io/w8etr

Abstract: Does politics conflict with love? We aim at answering this question by examining the effect on union dissolution of partners’ (mis)match on political preferences, defined as self-reported closeness, intention to vote, or vote for a specific party. Previous studies argued that partners’ heterogamy may increase risk of union dissolution because of differences among partners in lifestyles, attitudes, and beliefs, and/or because of disapproval from family and community members. We posit that similar arguments can apply to political heterogamy and test the effect of this new heterogamy dimension using UK data from the British Household Panel Study (BHPS) and the UK Household Longitudinal Study (UKHLS). The data offer a unique opportunity to disentangle the role of heterogamy by political preferences from the effects of heterogamies in other domains (e.g., ethnicity and religiosity) and from that of other partners’ characteristics, while also covering a long period of time (from 1991 to 2021). The data also allow to implement a more specific analysis about the referendum on UK’s permanence in the European Union (known as the Brexit referendum). We find a positive effect of political preferences heterogamy on union dissolution. In addition, diverging opinions on the Brexit referendum is associated to higher chances of partnership break-up.


The Effect of Taboo Language and Gesture on the Experience of Pain; against common opinion, it seems these effects are likely not due to changes in state aggression

F@#k Pain! The Effect of Taboo Language and Gesture on the Experience of Pain. Autumn B. Hostetter, Dominic Knight Rascon-Powell. Psychological Reports, September 8, 2022. https://doi.org/10.1177/00332941221125776

Abstract: Swearing has been shown to reduce the experience of pain in a cold pressor task, and the effect has been suggested to be due to state aggression. In the present experiment, we examined whether producing a taboo gesture (i.e., the American gesture of raising the middle finger) reduces the experience of pain similar to the effect that has been shown for producing a taboo word. 111 participants completed two cold pressor trials in a 2 (Language vs. Gesture) × 2 (Taboo vs. Neutral) mixed design. We found that producing a taboo act in either language or gesture increased pain tolerance on the cold pressor task and reduced the experience of perceived pain compared to producing a neutral act. We found no changes in state aggression or heart rate. These results suggest that the pain-reducing effect of swearing is shared by taboo gesture and that these effects are likely not due to changes in state aggression.

Keywords: pain, profanity, swearing, gesture, hypoalgesic


Saturday, September 10, 2022

Behavioral scientists are consistently no better than, and often worse than, simple heuristics and models; why have markets & experience not eliminated their biases entirely?

Simple models predict behavior at least as well as behavioral scientists. Dillon Bowen. arXiv, August 3, 2022. https://arxiv.org/abs/2208.01167

Abstract: How accurately can behavioral scientists predict behavior? To answer this question, we analyzed data from five studies in which 640 professional behavioral scientists predicted the results of one or more behavioral science experiments. We compared the behavioral scientists’ predictions to random chance, linear models, and simple heuristics like “behavioral interventions have no effect” and “all published psychology research is false.” We find that behavioral scientists are consistently no better than - and often worse than - these simple heuristics and models. Behavioral scientists’ predictions are not only noisy but also biased. They systematically overestimate how well behavioral science “works”: overestimating the effectiveness of behavioral interventions, the impact of psychological phenomena like time discounting, and the replicability of published psychology research.

Keywords: Forecasting, Behavioral science

3 Discussion
Critical public policy decisions depend on predictions from behavioral scientists. In this paper, we asked how accurate those predictions are. To answer this question, we compared the predictions of 640 behavioral scientists to those of simple mathematical models on five prediction tasks. Our sample included a variety of behavioral scientists: economists, psychologists, and business professionals from academia, industry, and government. The prediction tasks also covered various domains, including text-message interventions to increase vaccination rates, behavioral nudges to increase exercise, randomized control trials, incentives to encourage effort, and attempts to reproduce published psychology studies. The models to which we compared the behavioral scientists were deliberately simple, such as random chance, linear interpolation, and heuristics like “behavioral interventions have no effect” and “all published psychology research is false.” We consistently found that behavioral scientists are no better than - and often worse than - these simple heuristics and models. In the exercise, flu, and RCT studies, null models significantly outperformed behavioral scientists. These null models assume that behavioral treatments have no effect; behavioral interventions will not increase weekly gym visits, text messages will not increase vaccination rates, and nudges will not change behavior. As we can see in Table 1, compared to behavioral scientists, null models are nearly indistinguishable from the oracle. In the effort study, linear interpolations performed at least as well as professional economists. These interpolations assumed that all psychological phenomena are inert; people do not exhibit risk aversion, time discounting, or biases like framing effects. In the reproducibility study, professional psychologists’ Brier scores were virtually identical to those of a null model, which assumed that all published psychology research is false. 
Professional psychologists were significantly worse than both linear regression and random chance. Notably, the linear regression model used data from the reproducibility study, which were not accessible to psychologists during their participation. While this is not a fair comparison, we believe it is a useful comparison, as the linear regression model can serve as a benchmark for future attempts to predict reproducibility. Why is it so hard for behavioral scientists to outperform simple models? One possible answer is that human predictions are noisy while model predictions are not [Kahneman et al., 2021]. Indeed, there is likely a selection bias in the prediction tasks we analyzed. Recall that most of the prediction tasks asked behavioral scientists to predict the results of ongoing or recently completed studies. Behavioral scientists presumably spend time researching questions that have not been studied exhaustively and do not have obvious answers. In this case, the prediction tasks were likely exceptionally challenging, and behavioral scientists’ expertise would be of little use. However, behavioral scientists’ predictions are not only noisy but also biased. Previous research noted that behavioral scientists overestimate the effectiveness of nudges [DellaVigna and Linos, 2022, Milkman et al., 2021]. Our research extends these findings, suggesting that behavioral scientists believe behavioral science generally “works” better than it does. Behavioral scientists overestimated the effectiveness of behavioral interventions in the exercise, flu, and RCT studies. In the exercise study, behavioral scientists significantly overestimated the effectiveness of all 53 treatments, even after correcting for multiple testing. Economists overestimated the impact of psychological phenomena in the effort study, especially for motivational crowding out, time discounting, and social preferences. 
Finally, psychologists significantly overestimated the replicability of published psychology research in the reproducibility study. In general, behavioral scientists overestimate not only the effect of nudges, but also the impact of psychological phenomena and the replicability of published behavioral science research. Behavioral scientists’ bias can have serious consequences. A recent study found that policymakers were less supportive of an effective climate change policy (carbon taxes) when a nudge solution was also available [Hagmann et al., 2019]. However, accurately disclosing the nudge’s impact shifted support back towards carbon taxes and away from the nudge solution. In general, when behavioral scientists exaggerate the effectiveness of their work, they may drain support and resources from potentially more impactful solutions. Our results raise many additional questions. For example, is it only behavioral scientists who are biased, or do people, in general, overestimate how well behavioral science works? The general public likely has little exposure to RCTs, social science experiments, and academic psychology publications, so there is no reason to expect that they are biased in either direction. Then again, the little exposure they have had likely gives an inflated impression of behavioral science’s effectiveness. For example, a TED talk with 64 million views as of May 2022 touted the benefits of power posing, whereby one can reap the benefits of improved self-confidence and become more likely to succeed in life by adopting a powerful pose for one minute [Carney et al., 2010, Cuddy, 2012]. However, the power posing literature was based on p-hacked results [Simmons and Simonsohn, 2017], and researchers have since found that power posing yields no tangible benefits [Jonas et al., 2017]. Additionally, people may generally overestimate effects due to the “What you see is all there is” (WYSIATI) bias [Kahneman, 2011].
For example, the exercise study asked behavioral scientists to consider, among other treatments, how much more people would exercise if researchers told them they were “gritty.” After the initial “gritty diagnosis,” dozens of other factors determined how often participants in that condition went to the gym during the following four-week intervention period. Work schedule, personal circumstances, diet, mood changes, weather, and many other factors also played key roles. These other factors may not have even crossed the behavioral scientists’ minds. The WYSIATI bias may have caused them to focus on the treatment and ignore the noise of life that tempers the treatment’s signal. Of course, this bias is likely to cause everyone, not only behavioral scientists, to overestimate the effectiveness of behavioral interventions and the impact of psychological phenomena. If people generally overestimate how well behavioral science works, are they more or less biased than behavioral scientists? Experimental economics might suggest that behavioral scientists are less biased because people with experience tend to be less biased in their domain of expertise. For example, experienced sports card traders are less susceptible to the endowment effect [List, 2004], professional traders exhibit less ambiguity aversion than novices [List and Haigh, 2010], experienced bidders are immune to the winner’s curse [Harrison and List, 2008], and CEOs who regularly make high-stakes decisions are less susceptible to possibility and certainty effects [List and Mason, 2011]. Given that most people have zero experience with behavioral science, they should be more biased than behavioral scientists. Then again, there are at least three reasons to believe that behavioral scientists should be more biased than the general population: selection bias, selective exposure, and motivated reasoning. First, behavioral science might select people who believe in its effectiveness. 
On the supply side, students who apply to study psychology for five years on a measly PhD stipend are unlikely to believe that most psychology publications fail to replicate. On the demand side, marketing departments and nudge units may be disinclined to hire applicants who believe their work is ineffective. Indeed, part of the experimental economics argument is that markets filter out people who make poor decisions [List and Millimet, 2008]. The opposite may be true of behavioral science: the profession might filter out people with an accurate assessment of how well behavioral science works. Second, behavioral scientists are selectively exposed to research that finds large and statistically significant effects. Behavioral science journals and conferences are more likely to accept papers with significant results. Therefore, most of the literature behavioral scientists read promotes the idea that behavioral interventions are effective and psychological phenomena substantially influence behavior. However, published behavioral science research often fails to replicate. Lack of reproducibility plagues not only behavioral science [Collaboration, 2012, 2015, Camerer et al., 2016, Mac Giolla et al., 2022] but also medicine [Freedman et al., 2015, Prinz et al., 2011], neuroscience [Button et al., 2013], and genetics [Hewitt, 2012, Lawrence et al., 2013]. Scientific results fail to reproduce for many reasons, including publication bias, p-hacking, and fraud [Simmons et al., 2011, Nelson et al., 2018]. Indeed, most evidence that behavioral scientists overestimate how well behavioral science works involves asking them to predict the results of nudge studies. However, there is little to no evidence that nudges work after correcting for publication bias [Maier et al., 2022]. Even when a study successfully replicates, the effect size in the replication study is often much smaller than that reported in the original publication [Camerer et al., 2016, Collaboration, 2015]. 
For example, the RCT study paper estimates that the academic literature overstates nudges’ effectiveness by a factor of six [DellaVigna and Linos, 2022]. Finally, behavioral scientists might be susceptible to motivated reasoning [Kunda, 1990, Epley and Gilovich, 2016]. As behavioral scientists, we want to believe that our work is meaningful, effective, and true. Motivated reasoning may also drive selective exposure [Bénabou and Tirole, 2002]. We want to believe our work is effective, so we disproportionately read about behavioral science experiments that worked. Our analysis finds mixed evidence of the relationship between experience and bias in behavioral science. The RCT study informally examined the relationship between experience and bias for behavioral scientists predicting nudge effects and concluded that more experienced scientists were less biased. While we also estimate that more experienced scientists are less biased, we do not find statistically significant pairwise differences between the novice, moderately experienced, and most experienced scientists. Even if the experimental economics argument is correct that behavioral scientists are less biased than the general population, why are behavioral scientists biased at all? The experimental economics literature identifies two mechanisms to explain why more experienced people are less biased [List, 2003, List and Millimet, 2008]. First, markets filter out people who make poor decisions. Second, experience teaches people to think and act more rationally. We have already discussed that the first mechanism might not apply to behavioral science. And, while our results are consistent with the hypothesis that behavioral scientists learn from experience, they still suggest that even the most experienced behavioral scientists overestimate the effectiveness of nudges. The remaining bias for the most experienced scientists is larger than the gap between the most experienced scientists and novices. 
Why has experience not eliminated this bias entirely? Perhaps the effect of experience competes with the forces of “What you see is all there is,” selection bias, selective exposure, and motivated reasoning such that experience mitigates but does not eliminate bias in behavioral science. Finally, how can behavioral scientists better forecast behavior? One promising avenue is to use techniques that help forecasters predict political events [Chang et al., 2016, Mellers et al., 2014]. For example, the best political forecasters begin with base rates and then adjust their predictions based on information specific to the event they are forecasting [Tetlock and Gardner, 2016]. Behavioral scientists’ predictions would likely improve by starting with the default assumptions that behavioral interventions have no effect, psychological phenomena do not influence behavior, and published psychology research has a one in three chance of replicating [Collaboration, 2012]. Even though these assumptions are wrong, they are much less wrong than what behavioral scientists currently believe.
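
The base-rate-first approach described above can be illustrated with a small sketch. This is not the authors' procedure or any forecasting tournament's method. It is only one simple way to "start at the base rate and adjust," using a linear shrinkage rule chosen for illustration. The function name `anchored_forecast`, the intuitive estimate of 0.60, and the adjustment weight of 0.25 are all hypothetical; only the one-in-three replication base rate comes from the text.

```python
def anchored_forecast(base_rate: float, intuition: float, weight: float) -> float:
    """Start from a base rate and move part of the way toward an intuitive estimate.

    weight = 0 returns the base rate unchanged; weight = 1 returns the intuition.
    The linear rule is an illustrative assumption, not a method from the paper.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return base_rate + weight * (intuition - base_rate)


# Base rate taken from the text: about a one in three chance of replicating.
replication_base_rate = 1 / 3

# Hypothetical values: a forecaster's gut estimate and how much to trust it.
forecast = anchored_forecast(replication_base_rate, intuition=0.60, weight=0.25)
print(f"Anchored replication forecast: {forecast:.2f}")
```

With these illustrative numbers the forecast stays much closer to the base rate than to the gut estimate, which is the direction of correction the passage recommends. For intervention effects, the same rule would start from a base rate of zero.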