Saturday, February 11, 2023

There is little evidence of systematic support for wokeness among executives or the population; many firms embrace wokeness because middle managers engage in woke internal advocacy, which may increase their influence & job security

Foss, Nicolai J. and Klein, Peter G., Why Do Companies Go Woke? (November 23, 2022). SSRN:

Abstract: “Woke” companies are those that are committed to socially progressive causes, with a particular focus on diversity, equity, and inclusion as these terms are understood through the lens of critical theory. There is little evidence of systematic support for woke ideas among executives and the population at large, and going woke does not appear to improve company performance. Why, then, are so many firms embracing woke policies and attitudes? We suggest that going woke is an emergent strategy that is largely shaped by middle managers rather than owners, top managers, or employees. We build on theories from agency theory, institutional theory, and intra-organizational ecology to argue that wokeness arises from middle managers and support personnel using their delegated responsibility and specialist status to engage in woke internal advocacy, which may increase their influence and job security. Broader social and cultural trends tend to reinforce this process. We discuss implications for organizational behavior and performance including perceived corporate hypocrisy (“woke-washing”), the potential loss of creativity from restricting viewpoint diversity, and the need for companies to keep up with a constantly changing cultural landscape.

Keywords: DEI, diversity, social responsibility, strategic change, organizational structure

JEL Classification: M14, L21, M12, M52

Pervasively, not only in seminars: "Each intervention in a seminar is incomplete, and gets things wrong. Each subsequent intervention is also incomplete, and also gets things wrong"

A Black Professor Trapped in Anti-Racist Hell. Vincent Lloyd. Compact, Feb 2023.


This might be just another lament about “woke” campus culture, and the loss of traditional educational virtues. But the seminar topic was “Race and the Limits of Law in America.” Four of the 6 weeks were focused on anti-black racism (the other two were on anti-immigrant and anti-indigenous racism). I am a black professor, I directed my university’s black-studies program, I lead anti-racism and transformative-justice workshops, and I have published books on anti-black racism and prison abolition. I live in a predominantly black neighborhood of Philadelphia, my daughter went to an Afrocentric school, and I am on the board of our local black cultural organization.

Like others on the left, I had been dismissive of criticisms of the current discourse on race in the United States. But now my thoughts turned to that moment in the 1970s when leftist organizations imploded, the need to match and raise the militancy of one’s comrades leading to a toxic culture filled with dogmatism and disillusion. How did this happen to a group of bright-eyed high school students?


I am no stranger to anti-racism workshops: I have participated in many of them, and I have facilitated them myself. But the Telluride workshops were being organized by two college-age students, filled with the spirit of the times. From what I gleaned, they involved crudely conveying certain dogmatic assertions, no matter what topic the workshops were ostensibly about:

Experiencing hardship conveys authority.

There is no hierarchy of oppressions—except for anti-black oppression, which is in a class of its own. 

Trust black women.

Prison is never the answer.

Black people need black space.

Allyship is usually performative.

All non-black people, and many black people, are guilty of anti-blackness.

There is no way out of anti-blackness.

The seminar form pulls against the form of the anti-racism workshop, and Telluride was trying to have them both at once. By its nature, a seminar requires patience. Day by day, one intervention builds on another, as one student notices what another student overlooked, and as the professor guides the discussion toward the most important questions. All of this is grounded in a text: Specific words, phrases, arguments, and images from a text offer essential friction for conversation, holding seminar participants accountable to something concrete. The instructor gently—ideally, almost invisibly—guides discussion toward what matters.

The seminar assumes that each student has innate intelligence, even as we come from different backgrounds, have different amounts and sorts of knowledge, and different skills. We can each be formed best if we take advantage of our differing insights to push each other, over time, again and again. When this practice is occasioned by carefully curated texts—not exclusively “great books,” but texts that challenge each other and us as they probe issues of essential importance—a seminar succeeds.

A seminar takes time. The first day, you will be frustrated. The second and the third day, you will be frustrated. Even on the last day, you will be frustrated, though ideally now in a different way. Each intervention in a seminar is incomplete, and gets things wrong. Each subsequent intervention is also incomplete, and also gets things wrong. But there are plenty of insights and surprises, for each participant looks at a text with different eyes.

It is tempting to add: Such is life. Such is democratic life. We each have different, partial knowledge. We each get things wrong, over and over. At our best, we enter the fray by listening to each other and complementing and challenging the insights of our fellows. In the process, over years, decades, we are oriented toward justice and truth.


In the 2022 anti-racism workshops, the non-black students learned that they needed to center black voices—and to shut up. Keisha reported that this was particularly difficult for the Asian-American students, but they were working on it. (Eventually, two of the Asian-American students would be expelled from the program for reasons that, Keisha said, couldn’t be shared with me.) The effects on the seminar were quick and dramatic. During the first week, participation was as you would expect: There were two or three shy students who only spoke in partner or small-group work, two or three outspoken students, and the rest in the middle. One of the black students was outspoken, one was in the middle, and one was shy. By the second week of the seminar, the two white students were effectively silent. Two of the Asian-American students remained active (the ones who would soon be expelled), but the vast majority of interventions were from the three black students. The two queer students, one Asian and one white, were entirely silent. The black students certainly had interesting things to say and important connections to make with their experiences and those of their family members, but a seminar succeeds when multiple perspectives clash into each other, grapple with each other, and develop—and that became impossible.


In their “transformative-justice” workshop, my students learned to name “harms.” This language, and the framework it expresses, come out of the prison-abolition movement. Instead of matching crimes with punishments, abolitionists encourage us to think about harms and how they can be made right, often through inviting a broader community to discern the impact of harms, the reasons they came about, and paths forward. In the language of the anti-racism workshop, a harm becomes anything that makes you feel not quite right. For a 17-year-old at a highly selective, all-expenses-paid summer program, newly empowered with the language of harm, there are relatively few sites at which to use this framework. My seminar became the site at which to try out—and weaponize—this language.

During our discussion of incarceration, an Asian-American student cited federal inmate demographics: About 60 percent of those incarcerated are white. The black students said they were harmed. They had learned, in one of their workshops, that objective facts are a tool of white supremacy. Outside of the seminar, I was told, the black students had to devote a great deal of time to making right the harm that was inflicted on them by hearing prison statistics that were not about blacks. A few days later, the Asian-American student was expelled from the program. Similarly, after a week focused on the horrific violence, death, and dispossession inflicted on Native Americans, Keisha reported to me that the black students and their allies were harmed because we hadn’t focused sufficiently on anti-blackness. When I tried to explain that we had four weeks focused on anti-blackness coming soon, as indicated on the syllabus, she said the harm was urgent; it needed to be addressed immediately.


They alleged: I had used racist language. I had misgendered Brittney Griner. I had repeatedly confused the names of two black students. My body language harmed them. I hadn’t corrected facts that were harmful to hear when the (now-purged) students introduced them in class. I invited them to think about the reasoning of both sides of an argument, when only one side was correct. The students ended with a demand: In light of all the harms they had suffered, they could only continue in the class if I abandoned the seminar format and instead lectured each day about anti-blackness, correcting any of them who questioned orthodoxy.

The Shawshank Redemption Effect: In the US, the public's desire for harsh criminal punishment (including capital punishment) has been steadily declining

Public support for second look sentencing: Is there a Shawshank redemption effect? Kellie R. Hannan, Francis T. Cullen, Amanda Graham, Cheryl Lero Jonson, Justin T. Pickett, Murat Haner, Melissa M. Sloan. Criminology & Public Policy, February 2 2023.


Research Summary: Washington, DC has implemented second look sentencing. After serving a minimum of 15 years in prison, those convicted of a serious offense committed while under the age of 25 years can petition a judge to take a “second look” and potentially release them from incarceration. To examine both global and specific support for second look sentencing, we embedded experiments in a 2021 MTurk survey and in a follow-up 2022 YouGov survey. Two key findings emerged. First, regardless of whether a crime was committed under 18 years or under 25 years of age, a majority of the public supported second look sentencing. Opposition to the policy was low, even for petitioners convicted of murder. Second, as revealed by vignette ratings, respondents were more likely to support release when a petitioner “signaled” their reform (e.g., completed a rehabilitation program, received a recommendation from the warden) and had the support of the victim (or their family).

Policy Implications: The critique of mass imprisonment has broadened from a focus on the level of incarceration to the inordinate length of sentences being served by some prisoners. Policies are being proposed to reconsider these long sentences and to provide opportunities for earned release. Second look sentencing in DC is one of these reforms. Our research suggests that many members of the public believe in a “Shawshank redemption” effect—that those committing serious crimes as a teenager or young adult can mature into a “different person” and warrant a second look, with the possibility of early release if they have earned it. A key issue is likely to be how much weight is accorded to the preference of victims or their families in any release decision.

Same race teachers do not necessarily raise academic achievement

Same race teachers do not necessarily raise academic achievement. Jeffrey Penney. Economics Letters, Volume 223, February 2023, 110993.

Abstract: Numerous studies have found that students who are of the same race as their teacher experience increased academic achievement. In this paper, I attempt to explain when these benefits occur and which students are most likely to achieve the largest gains. Using exogenous variation in student–teacher matches and classroom composition from Tennessee’s Project STAR experiment, I find that below average achieving students benefit most from having a teacher of the same race, but the benefits from matching can be substantially reduced in smaller classes. Moreover, the effect is decreased in racially homogeneous classes where the teacher is the majority race.


There is substantial evidence that students do better academically when matched with a teacher of the same race (e.g., Dee, 2004; Fairlie et al., 2014; Egalite et al., 2015; Penney, 2017; Delhommer, 2022). However, educational interventions thought to improve student achievement can potentially fail to achieve the desired results under some circumstances; for example, Gilraine (2020) finds that class size reductions only increase student achievement when experienced instructors are teaching the smaller classes.

In this paper, I investigate the heterogeneities of own-race teacher effects, considering their distribution and examining in which scenarios they are most likely to be beneficial. I conduct an empirical analysis making use of data from Project STAR, a large-scale education experiment in Tennessee that was designed to investigate the effects of class size and teacher’s aides on student achievement. This is the first paper that focuses on investigating own-race teacher effects specifically with regard to student achievement and classroom composition using experimental data, filling a crucial gap in the literature.

The results of this analysis are as follows. I find that the benefit of student–teacher racial matching varies with academic ability, with lower-scoring students generally profiting the most whereas above-average achieving students see no distinguishable increase in test scores. An examination of the effects of classroom composition reveals that the magnitude of own-race teacher effects varies along this dimension: where the teacher shares the same race as only a few of their students, the own-race teacher effect on mathematics and reading scores is high; if instead almost all students in the class are the same race as their teacher, it is considerably lower.

This paper is organized as follows. Section 2 outlines the data from the Project STAR experiment and considers several threats to validity. The empirical analysis takes place in Section 3. The paper concludes with a discussion of the policy implications in Section 4.

Rolf Degen summarizing... Almost 60% of the population succumb, at least occasionally, to the "strange face illusion," where their own face can take on strange or even uncanny features

Strange face illusions: A systematic review and quality analysis. Joanna Mash et al. Consciousness and Cognition. Volume 109, March 2023, 103480.


• Strange face illusions are commonly experienced when fixating on face stimuli for prolonged periods under low light levels.

• Strange face illusions involve either the perception of distorting facial features on the actual face being observed or the perception of completely new strange faces.

• Illusions of new faces show a mean prevalence rate of 58% in healthy participants.

• Age and gender are positively related to prevalence rates, with more new faces reported by samples of older individuals and those with more female participants.

• Some key areas of methodological concern were apparent in the studies, relating to small unjustified sample sizes, a lack of preregistration and providing conclusions and interpretations that do not seem to be justified by results.

• Future research should aim to examine how the behavioural and environmental components of this paradigm combine to result in illusory faces and explain why illusions are more prevalent when gazing at others compared to self-reflections.


Background: Strange face illusions describe a range of visual apparitions that occur when an observer gazes at their image reflected in a mirror or at another person’s face in a dimly lit room. The illusory effects range from mild alterations in colour, or contrast, to the perception of distorted facial features, or new strange faces. The current review critically evaluates studies investigating strange face illusions, their methodological quality, and existing interpretations.

Method: Searches conducted using Scopus, PubMed, ScienceDirect and the grey literature until June 2022 identified 21 studies (N = 1,132; healthy participants n = 1,042; clinical participants n = 90) meeting the inclusion criteria (i.e., providing new empirical evidence relating to strange face illusions). The total sample had a mean age of 28.3 years (SD = 10.31) and two thirds (67 %) of participants tested to date are female. Results are reported using the Preferred Reporting Items for Systematic Reviews and meta-Analyses (PRISMA) guidelines. The review was preregistered at the Open Science Framework (OSF:

Results: Pooling data across studies, illusory new strange faces are experienced by 58% (95% CI 48 to 68) of nonclinical participants. Study quality as assessed by the Appraisal Tool for Cross-Sectional Studies (AXIS) revealed that 3/21 (14.28%) studies were rated as high, 9/21 (42.86%) as moderate and 9/21 (42.86%) as low quality. Whilst the items relating specifically to reporting quality scored quite highly, those relating to study design and possible biases were lower and more variable. Overall, study quality accounted for 87% of the variance in reporting rates for strange faces, with higher quality being associated with lower illusion rates. The prevalence of illusions was also significantly greater in samples that were older, in samples with higher proportions of female participants, and for the interpersonal dyad (IGDT) compared to the mirror gaze paradigm (MGT). The moderating impact of study quality persisted in a multiple meta-regression involving participant age, paradigm type (IGDT vs MGT) and level of feature distortion. Our review points to the importance of reduced light levels, face stimuli and prolonged eye fixation for strange face illusions to emerge.

Conclusion: Strange face illusions reliably occur in both mirror-gazing and interpersonal gazing dyad paradigms. Further research of higher quality is required to establish the prevalence and particularly, the mechanisms underpinning strange face illusions.

Keywords: Illusions, Anomalous subjective experiences, Dissociation, Mirror gazing, Perceptual distortion

4. Discussion

The present systematic review and meta-analysis provides a comprehensive synthesis and evaluation of the strange face illusion literature, the methodological quality of studies, and existing interpretations of strange face illusions. Our searches identified 21 studies (N = 1,132) involving non-clinical and clinical samples conducted over the past 12 years. Based on 17 datasets derived from non-clinical participants, we estimate the prevalence of illusions of new strange faces to be reported by almost 60 % of individuals.

Assessment of study quality using the AXIS revealed that overall quality for most studies (85 %) was in the low-to-moderate range. At a more specific level, while AXIS items relating to reporting quality were good, issues arose concerning study design and possible biases. Key areas of methodological concern are the often-small sample sizes, the lack of a priori power analysis, providing conclusions and interpretations that are not justified by the results, and a lack of discussion of limitations. Discussion of limitations is a key part of both scientific discourse and scientific progress, allowing readers to assess the validity of scientific work and to contextualise research findings (see Ioannidis, 2007), and its absence has been viewed by some as partly a failure of the peer review process (Horton, 2002). Although not part of the AXIS, we also note that to date, no published studies have been preregistered.

Meta-regression analyses identified two participant-based variables, with higher rates of strange face reporting in samples that are older and in samples with a higher proportion of female participants. Whilst exploratory, these findings are intriguing and not previously documented. The finding that strange face illusion reports were higher in samples with more female participants requires further investigation. The finding on age contrasts with work showing that both auditory and visual hallucinatory-type experiences in the general population tend to be more common in younger rather than older individuals (e.g., Larøi et al., 2019; Maijer et al., 2018). This suggests perhaps that strange face illusions may be different in kind from other hallucinatory-type experiences reported in the general population. Another possibility is that older individuals are more reticent to report spontaneous hallucinatory experiences, but the strange face paradigm provides a less threatening context to explore the emergence of unusual visual experiences. At a pragmatic level, poorer low-light vision in older individuals may also play a role in their higher rates of strange face illusions (see Beck & Harris, 1994). Despite this emphasis on age, most samples assessed within the strange face paradigm have been quite young (overall mean age of 28.3 years). Future research is required to determine whether the age effect persists in samples spanning a broader age range. Closer examination of age effects in future studies may also help reveal the mechanisms underlying SFIs, in the same way that studies of age effects in other visual, auditory, and multisensory illusions have provided unique insights into their cause (Billino et al., 2009; Campos et al., 2018; Doherty et al., 2010; Hirst et al., 2019; Mullin et al., 2021). Finally, we note that the recent Rasch analysis of SFQ items by Lange et al. (2022) did not find any evidence of significant differential item bias relating to either age or sex. This suggests that the moderating impact of both age and gender reported in our meta-analysis is probably not a reflection of SFQ item bias.

To date, studies have rarely systematically assessed how the manipulation of study design features impacts the reporting of strange face illusions. Our exploratory meta-regression analyses identified the importance of overall study quality (AXIS ratings), while sub-group analyses showed reporting rates are also significantly impacted by the paradigm employed (being greater for IGDT than MGT) and whether lux was measured at the face (greater when lux measured than not). The level of strange face illusions was also highly related to the level of feature distortions reported across studies. Most striking, however, was the finding that study quality accounted for 87 % of the variance in prevalence rates, and crucially it remained the only significant predictor of SFIs when all variables were entered into a multiple meta-regression. With current evidence, it is difficult to completely unpack this finding, as some of the mentioned variables may be confounded or interact, e.g., lower quality studies have tended to also examine samples with more women and who are older. So, while demographic variables (age and gender), procedural variables (paradigm type, whether lux is measured at the face) and the levels of reported feature distortions are important to assess, study quality remains the best predictor of strange face illusion prevalence – being higher in studies with lower rated quality.

The influence of the direct environment is a key factor in this paradigm because dimmed light only allows the observer to perceive a vague view of the face. Only one study has examined manipulating light levels (Caputo, 2010b), using two levels (0.8 vs 5 lx) with a small sample (n = 8) in the MGT. In this counterbalanced within-subject design, all eight participants reported apparitions of a new face in both conditions, but significantly more in the lower light condition. Caputo also reported a significantly quicker time-to-onset of illusions in lower-light levels (34.75 s vs 62.57 s). Generally, researchers have advocated that 0.8 lx measured at the face is the optimal level for illusion induction. In this context, we note that measuring the lux-value at the face has been inconsistent across studies. Our analyses show that reporting of strange faces is more common when lux value is established within the suggested range compared to when lux value is not assessed at all (74 % vs 41 % respectively). However, this finding is limited by the fact that all studies measuring lux levels at the face were conducted by the same author (G Caputo).

Studies show that different forms of facial configuration can induce strange face illusions, including self-face reflections, the faces of others and even face masks. By contrast, non-face stimuli such as the torso of the body (Jenkinson & Preston, 2017) or a simple dot (Caputo, 2013) fail to elicit illusions. It remains unclear however if it is faces per se, stimulus complexity, expertise, or familiarity with the task (i.e., face-gazing) that drives the illusion. Moreover, since masks induce the effect (Caputo, 2011), the face does not need to be human or show mobility – suggesting that a face-like configuration is sufficient to induce illusions.

While faces in various formats induce illusions, it is also notable that the reporting of new faces is significantly greater for interpersonal gazing than for mirror gazing (76 % vs 50 %). Although the IGDT and MGT paradigms share common experimental components necessary to induce face-related illusions (i.e., prolonged gaze fixation, low light levels and facial stimuli), the greater prevalence for the IGDT suggests that additional paradigm-specific factors may also be relevant. The IGDT clearly differs in terms of its social context – involving the presence of strangers in a potentially awkward or unusual social situation, where participants are required to stare intently at each other. The IGDT has also been associated with greater levels of dissociation, with CADSS scores of around 27 (Caputo, 2015; Caputo, 2019) while the MGT has lower CADSS scores, ranging from 7.8 (Nisticò et al., 2020) to 18.72 (Brewin et al., 2013). Whether dissociation is a precursor, a consequence or coincidental with the illusion remains to be established. Nonetheless, links between prolonged fixation and dissociation are well-documented and occur irrespective of the stimulus type (object, dot, own face in the mirror, photographed face: see Möllmann et al., 2019). Mild dissociation and very mild dysmorphic effects, such as an increase in perceived unattractiveness (Möllmann et al., 2019), often co-occur to a minor degree during any mirror-gazing. Prolonged fixation may well underpin the emergence of feature distortions. Indeed, Caputo proposed that the fixation-triggered Troxler effect “can explain the merging of facial features into a uniform silhouette of the facial contour” (Caputo, 2014, p. 5). Troxler fading (Troxler, 1804) typically occurs when fixation is maintained on a particular point on an unchanging stimulus, and even after short durations the peripheries (i.e., away from the fixation point) will fade away and disappear. Troxler fading, however, can only account for the disappearance of features surrounding the point of fixation on the face, which is a commonly reported illusory effect in this paradigm, but not for the merging or blending of features. These illusory effects likely result from other perceptual processes such as, for example, perceptual (textural) filling-in (Komatsu, 2006; Hsieh and Tse, 2009). Such perceptual processes may be employed to deal with a paucity of sensory data arising from the combination of prolonged gaze fixation (i.e., impairing our ability to selectively harvest higher acuity visual information) and low light levels (i.e., impairing one's ability to discriminate fine details of the face, attenuating colour perception etc.).

When appraising the role of prolonged fixation, the evidence assessing SFIs in various clinical groups could prove particularly informative. Whilst increased rates of illusions have been documented in individuals with anorexia nervosa (Demartini et al., 2020), hospitalised, depressed patients tended only to perceive mild feature distortions, with almost two-thirds not seeing any illusions at all (Caputo et al., 2014). Given that prolonged fixation seems crucial to the generation of strange face illusions, some of the variation between clinical groups may derive from differences in ability to maintain fixation and the fact that atypical eye-movement accompanies some disorders. For instance, compared to healthy controls, patients with depressive disorder show significantly abnormal eye-movement indices: they exhibit shorter fixation durations (Li et al., 2016), and this may potentially account for their reduced susceptibility to illusions in this paradigm. By contrast, people diagnosed with schizophrenia and non-clinical participants have remarkably similar fixation performance in terms of number and duration of fixations (Kissler and Clementz, 1998; Manor et al., 1999) and so might be as prone to the illusion as healthy controls. We note however that the findings in clinical groups have yet to be replicated and currently comprise analyses of relatively small samples. Furthermore, it would be crucial to investigate if any links between proneness to strange face illusions and transdiagnostic fixation issues reflect state or trait aspects of such disorders.

The loss of actual face-recognition has frequently been interpreted in this literature as a loss in self-identity (Brewin and Mersaditabari, 2013; Brewin et al., 2013; Caputo, 2010a; Caputo, 2011; Caputo, 2013; Caputo, 2015; Caputo, 2016; Caputo, 2019; Caputo, 2021; Caputo et al., 2012, 2014). Although MGT studies might lend themselves towards such an interpretation, a loss of self-identity cannot account for strange face illusions in the IGDT paradigm where self-recognition is not a factor, but where the illusion is more frequently reported. Our finding that illusions are reported by significantly more individuals in the IGDT than MGT paradigm does however accord with Caputo’s (2013) speculation that “If empathy is involved, then one should expect a higher frequency of illusions in inter-subjective gazing than in mirror-gazing” (p. 327). Empathy has been seen as central to increased dissociation and illusion formation in both the MGT (Caputo, 2016) and in IGDT (Caputo, 2013) paradigms, although dyad inducement has been linked loosely to Jungian notions of synchronicity. Nonetheless, only four studies to date have employed the IGDT paradigm (Caputo, 2013; Caputo, 2015; Caputo, 2017; Caputo, 2019) and further research is required to address the converging and diverging mechanisms and moderators across the two paradigms.

A key conceptual notion concerns whether SFIs are akin to the depersonalisation-like symptoms of “not recognising oneself in the mirror” (Fonseca-Pedrero et al., 2015; Caputo et al., 2020; Derome et al., 2018, 2022) or even “out of body” experiences (Caputo, 2014). In this context, Caputo’s (2019) factor analysis of Strange Face Questionnaire (SFQ) and Clinically Administered Dissociative States Scale (CADSS) data, from 90 healthy participants who participated in the IGDT, identified three factors. Feature distortions and most types of SFQ responses (8 items) loaded onto a derealisation factor (anomalous experiences of external reality, including faces). A further 7 items loaded onto a dissociative identity factor (anomalous experiences of identity/self), although this factor was independent of any sub-type of dissociation as measured by the CADSS. Only four out of 19 items loaded onto the final factor, identified as depersonalisation. More recently, Lange et al. (2022) re-assessed the same SFQ data from Caputo (2019) using a Rasch approach. Although the sample size (N = 90) is quite small for Rasch analysis (as it is for factor analysis), the authors identified potential problems with almost half of all SFQ items. For the depersonalisation factor, three of the four items displayed significant ‘extremity’ bias (i.e., these items were disproportionately easier to endorse for high than low SFQ scorers), suggesting that the depersonalisation factor is confounded by item difficulty bias. Most importantly, both the exploratory factor structure and the Rasch analysis require replication in a larger sample with a broader age range than the original (mean age = 22; range 19–36). Crucially, as Lange et al. (2022) acknowledge, both analyses should also be extended to see if a comparable factor structure exists for data derived from the far more frequently examined MGT.
This latter point is important given that we have shown that the IGDT paradigm induces a significantly greater prevalence of strange face illusions than the MGT.

Hallucinatory and complex illusory experiences are notoriously hard to introspect, assess and measure (see Rogers et al., 2021). Some issues relating to the assessment of strange face illusions stem from identifying and capturing the fleeting experiences themselves, but others relate to how the measures used might frame the experience. The Strange Face Questionnaire (SFQ), which provides the main formal assessment of the illusions, is directive insofar as it requires participants to interpret their illusions by choosing pre-selected (often Jungian) narratives that may embellish participant responses, e.g., “Did you see the face of a hero or heroine?”, “Did you see the face of a spiritual person?” and “Did you see the face of a sexually undefined person or an androgyne?” Additionally, the SFQ is administered after the experimental task and so participants are reporting on their recollections of their experiences. Given that the MGT has been used to induce dissociation (Brewin and Mersaditabari, 2013; Brewin et al., 2013; Pick et al., 2020; Rugens and Terhune, 2013; Shin et al., 2019) and that dissociation induced by the MGT has been shown to immediately impact memory, including visual memory (Brewin and Mersaditabari, 2013; Brewin et al., 2013), the reliance on memory, coupled with the requirement to choose a descriptive/narrative approximation, is likely to create significant demand characteristics. Nevertheless, making verbal reports ‘in-the-moment’ may be highly disruptive to both the process and the experience. It is also worth noting that while most studies have recruited participants who are naïve to the tasks (though see Caputo, 2011), priming and expectation effects remain a possible influence. Some evidence suggests that even those who are naïve to sensory-deprivation type paradigms are still able to predict the experience of visual hallucinations in such circumstances (see Jackson Jr & Pollard, 1966).
Future studies should therefore examine the possible influence of priming, expectation, and demand characteristics within both the IGDT and the MGT paradigms.

The other commonly used method to assess strange face illusions has been the response button. Some variability in frequency, duration of illusions and time-to-onset may reflect individual differences in thresholds for decision-making. Equally important is that the single response button subsumes a variety of illusory experiences into a single response and so confounds estimates for feature distortions and new strange faces. In this context, one study (Caputo, 2010b) conducted separate experiments in which two separate participant samples were instructed either, in experiment 1, to report “perceptual changes of their own face in the mirror” (p. 1127), which would include both feature distortions and new strange faces, or, in experiment 2, “to respond to new face apparitions” (p. 1130). Caputo reported that frequency, duration, and time of first apparition did not differ across experiments 1 and 2. However, given that we cannot assess the relative proportions of feature distortions to strange faces that were captured by this single phenomenological measure in experiment 1, we cannot eliminate the possibility that the two experiments are comparing two similar sets of face-related illusion experiences. Future studies should aim to characterise feature distortions versus new face illusions using independent response measures.

Turning to limitations of the current systematic review and meta-analysis, our assessment of study quality using the AXIS has certain limitations: total summed quality scores should be regarded with some caution, as individual items are not weighted (Greenland and O'Rourke, 2001; Greenland and Robins, 1994; Jüni et al., 1999). This means that any two studies with the same total AXIS score, but derived from different items, may not be directly comparable, as some items may be assessing more vital aspects of quality than other items. We therefore also examined the relative strengths or weaknesses across studies on domains of quality. Another factor that may have impacted the prevalence rates for strange face illusions reported here is that in three studies (Caputo, 2015; Caputo, 2017; Caputo, 2019), we relied upon estimates derived from a single item on the Strange Face Questionnaire, using the SFQ “yes” answers to item 5 (Did you see the face of a stranger or unknown person?). Similarly, for feature distortions, we used data derived from SFQ item 1 (Did you see that some facial traits were deformed?). While single items may over-simplify the reported experience, we preferred the use of a single item over multiple items to avoid double counting. For example, a single strange face experience can be registered multiple times on the SFQ (e.g., the illusion was an old person, who looked spiritual, but they had a similar nose to me, and they were of a different ethnicity).
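
The trade-off between a single SFQ item and summed items can be sketched with a few lines of Python. This is an illustration only: the participant data and item labels below are invented and are not taken from any of the reviewed studies or from the actual SFQ coding scheme.

```python
# Hypothetical SFQ responses: each participant maps to the set of items endorsed.
# Item labels and data are invented for illustration only.
responses = {
    "P1": {"stranger", "old_person", "spiritual_person"},  # one illusion, three endorsements
    "P2": {"stranger"},
    "P3": set(),  # no new-face illusion reported
    "P4": {"old_person", "different_ethnicity"},  # not captured by the "stranger" item
}


def single_item_prevalence(responses, item):
    """Proportion of participants who endorsed one chosen item."""
    endorsed = sum(1 for items in responses.values() if item in items)
    return endorsed / len(responses)


def total_endorsements(responses):
    """Raw count of 'yes' answers summed over all items and participants."""
    return sum(len(items) for items in responses.values())


prevalence = single_item_prevalence(responses, "stranger")
endorsements = total_endorsements(responses)

print(f"single-item prevalence: {prevalence:.2f}")
print(f"summed endorsements: {endorsements} across {len(responses)} participants")
```

In this toy data set, summing endorsements yields more "yes" answers than there are participants, which is the double counting described above, whereas the single item counts each participant at most once but can miss someone who endorsed only other items.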

Our review highlights the need to establish external validity, given that the reporting rates of new faces (see Fig. 4) show considerable variability across studies, with a downward trend over time from 100 % (Caputo, 2010a) to 32 % (Derome et al., 2022). Overall, one author (Caputo, who originated the illusion) is an author in almost three-quarters (15/21) of the studies reported here. Caputo is, to date, the only author who has recorded lux values at the face and also the sole author to employ the IGDT paradigm, both of which are associated with significantly higher reporting of illusions. An important aim of the current review is to encourage wider investigation of the strange face illusion, which we believe has relevance for researchers interested in understanding broader questions relating to perceptual instability, illusions, and hallucinations. Reviewing a decade of primarily phenomenological studies, it would seem pertinent now to move more toward experimentally assessing how environmental and behavioural manipulations impact the phenomenology within this paradigm and even the psychophysics of this complex illusion. Several new findings emerged from our systematic review and meta-analysis. The reporting rate for the illusion was related to demographic variables (age and gender) and methodological variables (paradigm type, i.e., IGDT versus MGT, and whether lux was measured at the face), with overall study quality being the strongest predictor of strange face illusion prevalence. To date, studies have focussed on relatively young, predominantly female samples. These findings indicate that future high-quality studies sampling more widely, to include older and more gender-balanced samples, would aid examination of the illusion and its implications.
Examining susceptibility to the illusion in older individuals should be contextualised by the fact that poorer low-light vision in older individuals may also play a role in increasing rates of strange face illusions. An important new finding from the current review has been that the rate at which the illusion is reported differs significantly across the two paradigms. To date only four studies have employed the IGDT (Caputo, 2013; Caputo, 2015; Caputo, 2017; Caputo, 2019) and so this paradigm requires further examination. In this context, the exploratory factor structure (Caputo, 2019) of the SFQ has been examined only in relation to data derived from the IGDT, and should be extended to see if a comparable factor structure exists for data derived from the far more frequently examined MGT. Finally, it is important to develop a robust method to capture and characterise both the temporal and phenomenological dynamics in such a way that will allow independent assessment of illusory phenomena that appear to be mechanistically distinct (i.e., feature distortions vs new strange faces).