Thursday, December 2, 2021

Rates of positive changes related to therapy varied between 26.6% (relationship to parents) & 67.7% (improvement in depressed mood); deteriorations were related to physical well-being (13.1%), ability to work (13.1%) & vitality (11.1%)

Negative effects of psychotherapy: estimating the prevalence in a random national sample. Bernhard Strauss, Romina Gawlytta, Andrea Schleu and Dominique Frenzl. BJPsych Open, Volume 7, Issue 6, November 2021, e186. https://doi.org/10.1192/bjo.2021.1025

Abstract

Background: Negative or adverse effects of psychological treatments are increasingly a focus of psychotherapy research. Yet, we still know little about the prevalence of these effects.

Aims: Starting from a representative national sample, the prevalence of negative effects and malpractice was determined in a subsample of individuals reporting psychotherapy currently or during the past 6 years.

Method: Out of an initial representative sample of 5562 individuals, 244 were determined to have had psychotherapy within the past 6 years. Besides answering questions related to treatment, its effects and the therapists, patients filled out the Negative Effects Questionnaire, items of the Inventory of Negative Effects of Psychotherapy reflecting malpractice and the Helping Alliance Questionnaire, and rated psychotherapeutic changes in different areas.

Results: Rates of positive changes related to therapy varied between 26.6% (relationship to parents) and 67.7% (improvement in depressed mood). Deteriorations were most commonly related to physical well-being (13.1%), ability to work (13.1%) and vitality (11.1%). Although patients generally reported a positive helping alliance, many of them reported high rates of negative effects (though not always linked to treatment). This was especially true of the experience of unpleasant memories (57.8%), unpleasant feelings (30.3%) and a lack of understanding of the treatment/therapist (19.3/18.4%). Indicators of malpractice were less common, with the exception that 16.8% felt violated by statements of their therapist.

Conclusions: This study helps to better estimate aspects of negative effects in psychotherapy, ranging from deteriorations to specific effects and issues of malpractice, that should be replicated and specified in future studies.

Discussion

Based upon the conclusion that our knowledge about negative effects of psychotherapy is still limited (Parry et al, ref. 1; Gerke et al, ref. 6), one of the unmet needs is sufficient study of the type and quantity of negative effects of psychotherapy under naturalistic conditions. There are several approaches to reach the goal of acquiring more detailed data concerning negative effects. For example, Crawford et al (ref. 5) approached psychotherapeutic services in England and Wales to survey patients receiving treatment within these services. Although this approach might result in a population close to being representative of psychotherapy patients in a specific health system, it would not be representative of the wider population.

Another approach would be to start by drawing a random sample from a national population and to filter those individuals who had received psychotherapeutic treatment in a certain time period. The latter approach was chosen in a study of Albani et al (refs 18, 20) related to the German population. In contrast to our survey, that study focused on formal characteristics of psychotherapies, patients’ experiences with choosing and finding a therapist, and general figures related to the effectiveness of psychotherapy from the patients’ perspective. In their survey, Albani et al asked only a very small number of questions related to general opinions about the patients’ psychotherapists and did not explicitly focus on negative effects. The sampling method of the Albani et al study probably did not yield a sample representative of psychotherapy patients in Germany. On the other hand, by avoiding direct selection of these patients, the procedure likely resulted in an unbiased sample from which patient experiences could be derived.

So far, data related to the prevalence of psychotherapeutic change, change rates and the occurrence of negative effects are quite variable and do not allow aggregation owing to the different data sources and measures. To add data from a representative population, our study followed the model of Albani et al, selecting individuals from a random national sample of the German population and determining which had been treated with psychotherapy. This resulted in a sample of 244 individuals who were interviewed in detail – in contrast to Albani et al – with a focus on effectiveness, helping alliance and a description of negative effects.

In fact, the resulting sample had quite similar characteristics to those of the German population. The ratio of males to females appeared to be more balanced in our sample than in the Albani study and closer to the distribution of the national population. In a large clinical sample of German out-patients (Altmann et al, ref. 30), the percentage of female patients was much higher than in Albani's study (refs 19, 20; 77%), showing that the general population is different from the population using the psychotherapeutic system. Individuals in the under-45 age group were underrepresented whereas those of 45 to 65 years of age were overrepresented in our sample, compared with the general population. Compared with the national population, individuals in our sample had a higher educational level. This probably reflects selective mechanisms of patients’ access to the psychotherapeutic system (Strauß, ref. 31).

Of the initial sample, 7.44% indicated experiences with psychotherapy during the prior 6 years. Although there are no exact estimates of the proportion of individuals seeking psychotherapeutic treatment in Germany, there are some figures for this percentage that can be used for comparison. Rommel et al (ref. 32) reported that 11.3% of German females and 8.1% of males over 18 years of age sought psychotherapeutic or psychiatric help over the course of 1 year (Survey Health in Germany). A study of adult health in Germany (Rattay et al, ref. 33) reported that 5.3% of females and 3.2% of males between 18 and 79 years of age made use of psychotherapy in the public health system (i.e. attending licensed therapists with reimbursement of the costs by health insurance). Based on these comparative figures, we think that our sample reflects a realistic proportion of psychotherapy users.

Based on the data obtained in our interview study with the final sample of 244 (former) psychotherapy patients, we found a relatively positive evaluation of the therapeutic relationship using the HAQ (Bassler et al, ref. 22), which was comparable to that found by other studies. The reports of our sample were generally positive regarding the quality of the working alliance and trust in the therapeutic relationship. At least 80% of all individuals agreed at least to some extent with the positive formulations of the HAQ.

On the other hand, there were some indicators of problems in the therapeutic relationship. One of the most prominent indicators was the report that at least 19% thought that the treatment would not help and ended their therapy prematurely. Also of relevance is the finding that in 24.2% of those cases, the end of treatment was a proposal of the therapist. Although we have no information on whether these were negotiated or unilateral decisions, this finding raises concerns about the lack of participatory decision-making about when to end therapy.

Although we did not use standardised scales that are commonly used to assess treatment outcomes, our data suggest that ‘direct measurements’ of different fields susceptible to psychotherapeutic change indicate improvement rates between 26.6% and 67.6%. The improvement rates of common outcomes (i.e. interactions with others, improvement in depressed mood, personal development), reported by more than 60% of the individuals, particularly demonstrate that the sample might be representative of psychotherapy patients, as similar rates are reported in the research literature (Lambert, ref. 12).

The improvement rates in our sample are also similar to those reported by Albani et al, with respect to both change rates and rates of deterioration as well as differences between single areas of change. However, the improvement rates in the Albani study (with a larger sample) were somewhat higher than those in our sample. For example, an improvement in depressed mood was reported by 67.6% of our sample and by 78.6% in the Albani study. The general evaluations of the treatments were also in line with those reported by Albani et al.

The primary focus of our study was an estimation of negative effects (or side-effects as negative effects paralleling correct treatment in the sense of Linden's classification, ref. 3) of psychotherapies, with the NEQ as the core instrument. Twenty different negative effects could be attributed to the treatment or to other causes.

The survey results reported in our sample are comparable with those reported in different clinical samples with the NEQ; we found similar results to those of other studies using this method in different samples and psychotherapeutic settings. In a recent study, Rozental et al (ref. 2) reported: ‘As for the rate of negative effects, the number of participants reporting negative effects in the current study was 50.9%, consistent with 58.7% among patients in a psychiatric setting who responded to the INEP’ (Rheker et al, ref. 34). However, this number varies significantly between investigations, with rates as high as 92.9% among patients with obsessive–compulsive disorder assessed with the Side-effects of Psychotherapy Scale in a study by Moritz et al (ref. 35), and as low as 5.2% in a national survey by Crawford et al (ref. 5) probing for ‘lasting bad effects from the treatment’. Hence, different studies assess a range of negative effects, from transient ‘side-effects’ to lasting harm, making it difficult to compare ratios directly. Even within a subtype of negative effect, different methods of assessment will yield different results, so accurate estimates are not yet available.

Finally, since we had limited resources, we restricted our investigation of malpractice and boundary violations in psychotherapy in this study to only the six items of the INEP. These items form a subscale of the instrument mainly developed to cover side-effects of psychotherapeutic interventions. In general, in our sample, the rates of boundary violations were very low, even lower than one would have estimated from the specific studies in this field. For example, Becker-Fischer and Fischer (ref. 36) reported rates of sexual boundary violations in psychotherapies that were much higher than 5%, whereas in our sample such violations occurred in three of the 244 cases (1.2%).
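
To give a sense of how uncertain a rate based on three cases is, the following minimal sketch reproduces the 1.2% figure and computes a 95% Wilson score interval around it. This is an illustration only: the interval is not reported in the study, and the helper name `wilson_interval` and the choice of the Wilson method are assumptions made here.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Return the (low, high) Wilson score interval for a binomial proportion.

    Hypothetical helper for illustration; z=1.96 corresponds to ~95% coverage.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Figures taken from the text: three cases in a sample of 244.
CASES, SAMPLE = 3, 244
rate = CASES / SAMPLE
low, high = wilson_interval(CASES, SAMPLE)
print(f"rate = {rate:.1%}, approx. 95% interval = {low:.1%} to {high:.1%}")
```

With so few cases, the interval is wide relative to the point estimate.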

Strengths and limitations

The main strength of this study was clearly the sampling procedure, which started with a large (>5000) sample representative of the German population and then sought to find individuals disclosing experience with psychotherapy in the German health system, currently or during the past 6 years. We used some of the items from a former survey focusing on more general aspects of psychotherapy and added (parts of) instruments specifically developed to capture negative effects (NEQ) or malpractice (INEP). These additions have shown good psychometric qualities in this and other studies and allow comparisons with other studies or sampling procedures. Compared with other studies, e.g. the Crawford et al (ref. 5) survey, we obtained much more detailed results on negative effects as opposed to global ‘lasting bad effects’.

Despite our best efforts, the final sample of 244 was rather small, although it was within the expected range for the use of psychotherapy in the population. Another limitation was the fact that 98 of the 244 participants were surveyed, on average, 2.63 years after completion of their psychotherapy. Of the 244, 139 had already completed their psychotherapy, among whom 98 provided the date of the end of therapy. Thus, the results may have been biased by recall effects. More specifically, there may have been a tendency to only remember adverse aspects of the treatment and neglect the positive ones, or to forget certain unwanted events that occurred several years ago. However, comparisons between those currently undergoing psychological treatment and those remembering their treatment retrospectively yielded only minor differences with respect to both general evaluations of psychotherapy and negative effects.

Moreover, as only 65% of eligible participants accepted the invitation to the interview, the results could be open to selection bias. For example, participants who were unhappy about their treatment might be more (or less) likely to respond to a study on the effects of psychotherapy or might exaggerate negative effects experienced during psychotherapy.

A comparison of demographic data from the recruited sample and the final sample revealed some minor differences regarding age distribution and educational level. However, participants were not recruited only on the basis of potential experiences of negative effects, as positive aspects of treatments were evaluated as well, limiting the risk of selection bias. Also, the response rate in our study was similar to those of other studies on negative effects of psychotherapy, which found rates of 59% (Gerke et al, ref. 6) and 61% (ref. 29); it was even much higher than the rate of 19% found in one study (Crawford et al, ref. 5).

Our results related to problematic issues such as boundary violations should encourage a detailed examination of patient complaints. So far, these have been mainly reported by certain institutions that serve as receiving agencies for psychotherapy-related complaints (Khele et al, ref. 37; Kaczmarek et al, ref. 38).

In the future, more research on the prevalence of negative effects would be useful. This would include a more systematic assessment of these effects in clinical trials (Klatte et al, ref. 8). It would be interesting to try to recruit a sample similar to that used in our study to estimate the occurrence of more subtle violations of borders and other problematic issues in psychotherapy. According to the studies relating to such complaints, these are much more common than severe ethical problems such as a sexual assault in the treatment room. Addressing such violations and intensifying the more general focus on negative effects would eventually enrich training, supervision and clinical practice with the goal of avoiding harm in psychotherapy.

Asexuals reported lower satisfaction, investment size, & commitment, and higher quality of alternatives, than did allosexual individuals across both romantic relationships & friendships

Asexuality and relationship investment: Visible differences in relationship investment for an invisible minority. Jared Edge, Jennifer Vonk & Lisa Welling. Psychology & Sexuality, Dec 1 2021. https://doi.org/10.1080/19419899.2021.2013303

Abstract: Sexual attraction is a component of most romantic relationships, making it difficult to disentangle from other motives to invest in relationships. Despite the lack of sexual attraction that characterizes asexuality, many self-identified asexual individuals report the desire to enter a romantic relationship. These understudied individuals provide a unique opportunity to study relationship investment in the absence of sexual attraction. We compared relationship investment, a well-established aspect of interpersonal relationships, between asexual (n=139) and allosexual (n=224) individuals. Participants completed a modified Investment Model Scale, which examined satisfaction, quality of alternatives, investment size, and commitment in romantic relationships and friendships. Contrary to our prediction that asexual individuals would invest less than allosexual individuals in romantic relationships, but not in friendships, they reported lower satisfaction, investment size, and commitment, and higher quality of alternatives than did allosexual individuals across both types of relationships. Although lack of sexual attraction could explain lower investment scores in romantic relationships for asexual individuals, some other effect may be responsible for reported differential investment in friendships.

Keywords: Sexual Orientation; Asexuality; Romantic relationships; Friendships; Relationship Investment


We do not seem more responsive to evolutionary-based threats: The results of behavioral experiments pose a challenge to established theories, as they show faster reaction time to modern threats, which is the opposite of what some evolutionary theories predict

Snakes vs. Guns: a Systematic Review of Comparisons Between Phylogenetic and Ontogenetic Threats. Soheil Shapouri & Leonard L. Martin. Adaptive Human Behavior and Physiology, Dec 2 2021. https://link.springer.com/article/10.1007/s40750-021-00181-5

Abstract

Objectives: The potential differences between phylogenetic threats (e.g., snakes) and ontogenetic threats (e.g., guns) can have a wide-ranging impact on a variety of theoretical and practical issues, from etiology of specific phobias to stimulus selection in psychophysiological studies, yet this line of research has not been systematically reviewed.

Methods: We summarize and synthesize findings from fear conditioning, illusory correlation, attention bias, and neuroimaging studies that have compared these two types of threats to human survival.

Results: While a few brain imaging studies reveal preliminary evidence for different brain networks involved in the processing of phylogenetic and ontogenetic threats, attention bias studies tentatively show faster reaction time for modern threats, illusory correlation bias is evident for both types of threats, and fear conditioning studies are far from conclusive.

Conclusions: The results of behavioral experiments, especially attention bias research, pose a challenge to established theories like biological preparedness and fear module, as they show faster reaction time to modern threats, which is the opposite of what some evolutionary theories predict. We discuss the findings in terms of other theories that might explain the same results and conclude with potential future directions.


Although discredited, the five stages of grief model remains prominent: about 60pct of sites included a description (from brief mention to detailed elaboration) of those stages

Stages of Grief Portrayed on the Internet: A Systematic Analysis and Critical Appraisal. Kate Anne Avis, Margaret Stroebe and Henk Schut. Front. Psychol., December 2 2021. https://doi.org/10.3389/fpsyg.2021.772696

Abstract: Kübler-Ross’s stage model of grief, while still extremely popular and frequently accepted, has also elicited significant criticisms against its adoption as a guideline for grieving. Inaccurate portrayal of the model may lead to bereaved individuals feeling that they are grieving incorrectly. This may also result in ineffectual support from loved ones and healthcare professionals. These harmful consequences make the presentation of the five stages model an important area of concern. The Internet provides ample resources for accessing information about grief, raising questions about portrayal of the stages model on digital resources. We therefore conducted a systematic narrative review using Google to examine how Kübler-Ross’s five stages model is presented on the internet. We specifically examined the prominence of the model, whether warnings, limitations and criticisms are provided, and how positively the model is endorsed. A total of 72 websites were eligible for inclusion in the sample. Our analyses showed that 44 of these (61.1%) addressed the model, indicating its continued popularity. Evaluation scores were calculated to provide quantitative assessments of the extent to which the websites criticized and/or endorsed the model. Results indicated low criticalness of the model, with sites often neglecting evaluative commentary and including definitive statements of endorsement. We conclude that such presentation is misleading; a definitive and uncritical portrayal of the model may give the impression that experiencing the stages is the only way to grieve. This may have harmful consequences for bereaved persons. It may alienate those who do not relate to the model. Presentation of the model should be limited to acknowledging its historical significance, should include critical appraisal, and present contemporary alternative models which better-represent processes of grief and grieving.

Discussion

Principal Findings

The purpose of this study was to gain better understanding of the presentation of Kübler-Ross’s five stages model on the internet. The concern to examine inclusion of the model on websites arose in large part from its critical assessment in scientific reviews and in the accounts of clinicians. Notably, scientific sources have drawn attention to the absence of a body of empirical research and lack of validity regarding the model. Clinicians have pointed to potential negative consequences for bereaved people who do not “conform” by going through the stages but who think that they should be experiencing them. In the face of these criticisms, it is important to explore how the model is presented to professionals and lay people in general, and to bereaved persons in particular. Technological advances have meant that the internet is widely used for giving and seeking support among bereaved persons, providing ample resources for accessing information about grief. This raises questions about the portrayal of the stages model through websites. We therefore conducted a systematic narrative review to examine the presentation of the five stages model of grief on the internet, investigating three research questions.

Our first research question addressed the prominence of the model; how frequently is it mentioned on websites providing information about grief? The results indicated the continued popularity of the model; 61.1% of websites included a description of the five stages, with accounts varying from brief mention to detailed elaboration of the model. This is a conservative estimate, given a further nine sites mentioned “stages” in general, indicating the possibility that nearly three quarters of all the sites referred to the model, at least non-specifically. This frequent inclusion is in line with Corr's (2018, 2019a) research results; the five stages were described in the majority of his sampled textbooks. Similarly, it seems to echo Sawyer et al.'s (2021) findings mentioned earlier, that roughly 68% of the general public and 44% of mental health professionals endorsed the stages. Furthermore, an exploration of the word count providing information about the five stages also highlighted the prominence of the model, with over a third of the sites devoting 50% or more of their word count to the stages. Taken together, these results raise the question why there is such continued attention to the model, especially given that there have also been notable criticisms. The popularity of the model may stem from its ability to create order during a time of complexity, resulting in a positive narrative where one prevails over the despair of grief, culminating in the final stage of acceptance. The following quote cited on one of the reviewed websites encapsulates this: ‘Stage theories “impose order on chaos, offer predictability over uncertainty, and optimism over despair”’ (Shermer, 2008, p. 6). However, as the same website goes on to conclude, the appeal of the stages model in creating a narrative of hope is not equivalent to scientific importance: “Stages are stories that may be true for the storyteller, but that does not make them valid for the narrative known as science” (p. 9).
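
The two prevalence figures in the paragraph above follow from counts given in the abstract and the discussion (72 eligible websites, 44 describing the five stages, a further nine mentioning “stages” in general). A minimal sketch of that arithmetic, with the variable and function names chosen here purely for illustration:

```python
def percent(part, total):
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)

TOTAL_SITES = 72      # websites eligible for inclusion (from the abstract)
DESCRIBED_MODEL = 44  # sites that addressed the five stages model
GENERIC_STAGES = 9    # further sites mentioning "stages" in general

described_pct = percent(DESCRIBED_MODEL, TOTAL_SITES)
any_mention_pct = percent(DESCRIBED_MODEL + GENERIC_STAGES, TOTAL_SITES)

print(f"described the five stages: {described_pct}%")
print(f"any reference to stages:   {any_mention_pct}%")
```

The second figure is what the text summarises as “nearly three quarters” of the sites.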

While our results showed that the five stages were mentioned frequently, closer examination of the data suggests differences in the portrayal of the stages between the included domain extensions. In particular, Dutch domain extensions appeared to refer to the model less frequently than the other domain extensions. This finding suggests that different countries may regard the model differently. The reasons behind these apparent differences are unclear, but one could speculate that a multitude of cultural and structural factors could play a role such as: underlying societal beliefs about death and dying, quality and quantity of educational programs providing information about issues surrounding grief, and ease of information accessibility, for example, to alternative models of grief.

Our second research question pertained to how the model was evaluated; what warnings, limitations and criticisms concerning the model were provided on the sites? Our exploration indicated that the most frequent types of warnings were those cautioning against the rigidity of the model; in particular, nearly 60 percent of sites included warnings that the stages are non-linear and half of the sites cautioned that not all five stages have to be experienced. This type of evaluation is also consistent with Corr's (2018, 2019a) analyses, which established that non-linearity and not having to experience all stages were the most commonly mentioned critiques in his sample of textbooks. However, close examination of these types of warnings showed that they often lack a critical stance, endorsing the existence of the model by giving the message that one will/should experience the stages, just not in a rigid manner with the five stages following on in a strict order. Moreover, as critics have pointed out, the word “stages” itself implies rigidity, such that warning against rigidity actually presents a confusing message. This is one of the model’s most contentious features, with proponents using non-linearity to underline the model’s broad interpretation possibilities and therefore wider application, while opponents have argued that it disqualifies the model. Friedman (2009) made the latter point on one of the websites in our sample:

We [have] compared the stages of a butterfly to the alleged stages of grief, to show the problem with any stage theories of grief. To wit: Stages in order to be called stages must go through an orderly progression, each and every time. Starting as an egg, a potential butterfly must go through the four stages Egg, Caterpillar (Larva), Pupa (Chrysalis) Adult (Imago). It cannot elect to skip the larval stage and jump right over to the pupal stage.

Elisabeth Kübler-Ross herself constantly stated that the stages didn’t all happen and not necessarily in order, if at all. We just can’t find a way to use the idea of stages which really are absolute—see Butterfly reference—for something as variegated as human grief (p. 11).

In addition to warnings of rigidity, our analysis established that a number of warnings of existence, limitations and criticisms of the five stages model were included on some of the websites, albeit very infrequently (the mean score for criticalness was 1.9 out of a possible total score of twelve points). The fact that a large portion of websites lacked any critical appraisal highlights concerns about the representation of the model, particularly with regard to the lack of evidence and the potential for harm. These concerns should give one pause for reflection about the use of the five stages model as a contemporary guideline for the bereaved.

Our final research question explored how the model was endorsed; how positively was it presented on the websites? Our analysis uncovered a number of different types of endorsements, which were defined in our study as statements showing support or approval of the five stages model. The most frequent endorsements were definitive statements (statements of unconditional approval) regarding the existence of the stages. As our results showed, the concerns we mentioned above were again confirmed. The high frequency of definitive statements about the stages’ existence is of considerable significance, since it suggests that the stages are an actuality; wrong conclusions about the validity of the five stages can easily be drawn by those accessing certain websites. The concern that this can have potentially harmful consequences for bereaved persons remains. The definitive endorsement of many sites on the internet can easily be interpreted as conveying the message that those who do not experience the stages are grieving incorrectly. As indicated earlier, advertising these stages as a certainty for bereaved people is unfounded. The implications of uncritical acceptance of the five stages model should not be underestimated; as one of the authors of our sampled websites cautions:

As we have pointed out in past articles, Kübler-Ross defined these “phases” as those experienced by a person dealing with the diagnosis of a terminal illness, and not as stages faced by someone who has faced a significant emotional loss. This misconception of their intended purpose has frustrated many grievers who felt that failure to progress through them could leave them forever in misery (Moeller, 2017, p. 6).

Furthermore, a definitive portrayal of the model can result in ineffectual support from loved ones or healthcare professionals. Insights from research on social and group norms have shown that violation of norms can lead to negative emotional reactions like anger or blame (Ohbuchi et al., 2004; Stamkou et al., 2019) as well as forms of social sanctions and punishment (Fehr and Fischbacher, 2004; Falk et al., 2005; Peters et al., 2017). A loved one or healthcare professional may, therefore, react in a negative way if they feel that the bereaved individual is violating the norm by not going through the stages. These reactions could result in bereaved people feeling alienated, an implication that is particularly worrying given that various studies have demonstrated the protective effect of social support in preventing negative effects in bereaved individuals (e.g., Hibberd et al., 2010; Çakar, 2020; Chen, 2020). Bereaved people themselves may also feel that there is something wrong with them for not grieving in line with the norm and may seek therapy to help move through the stages and grieve in the “correct” way. These endeavors may be unnecessary, especially considering that psychological interventions appear to be barely effective, if at all, for the bereaved population for whom there is no other indication (yet) than that they have lost a significant person (Schut et al., 2001; Wittouck et al., 2011). To put it concisely, presenting the five stages model in an uncritical and definitive light could lead to the belief that those who do not experience the stages are abnormal, a misconception which has important implications and potential harmful consequences for bereaved individuals.

In general, the results showed low criticality: sites often included definitive statements of endorsement while neglecting such warnings. Our conclusion is that the model is not being accurately portrayed to bereaved people, with the dangers of using it as a contemporary guideline largely being ignored.

Limitations of This Analysis

Limitations of this analysis need to be addressed. First, we noted the gap in time between the selection and analyses of the websites. While the majority of the sites were still operational when the data were analyzed (and, therefore, still relevant and accessible to the public as recently as 2020), an updated analysis could give insight into recent trends concerning the portrayal of the five stages model. This would be especially interesting in light of the recent coronavirus pandemic. Many noteworthy questions have arisen regarding how the portrayal of grief has changed as a result of COVID-19, including ones about the application of theoretical approaches (cf., Stroebe and Schut, 2020). An analysis of information on grief-related websites subsequent to the current pandemic would add further insights into how understandings of grief have changed following COVID-19. For example, one relevant question in the context of our study is whether the sites have continued to advocate the five stages model under these changed circumstances.

Another limitation relates to the restriction to English and Dutch language websites. While this analysis ensured that both developing countries and non-English sites were represented, one avenue for future research could be to include more country-specific domain extensions, in order to achieve further representation of different cultures and languages and establish the influence of the five stages model in other parts of the world.

Additionally, an important limitation of this study has to do with the review process itself. Analysis of written text can lend itself to subjective interpretation (Given, 2008, p. 120–122). Certain warnings, limitations, critiques and endorsements were, for example, worded more implicitly than others, making them open to interpretation. An example of this is seen in the following text taken from one of our websites: “You may go back and forth between them or skip one or more stages altogether” (What is normal grieving WebMD, n.d., p. 4). While the text is not explicitly stating that the stages are non-linear, the phrase “back and forth” could be interpreted as implying non-linearity. An analysis of the researchers’ thought processes behind the determination of the different criticisms and endorsements revealed that while there was often agreement concerning the presence of a criticism or endorsement on a website (interrater agreement was nearly 96 percent), there was occasional disagreement about the exact statement representing these criticisms and endorsements. One possible explanation for this is that websites may contain multiple phrasings of the same premise, resulting in certain statements resonating with a particular individual more than others, but culminating in overall agreement on the message of the website. However, such differences occurred with too little frequency for the patterns of results to be affected.
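The interrater agreement figure quoted above is a simple proportion of matching coding decisions. A minimal sketch of that arithmetic follows, assuming binary codes (1 = criticism or endorsement present, 0 = absent); the function name and the ratings are invented for illustration and are not the study's data.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of items on which two raters gave the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical codes for 25 coding decisions (made up for this example).
rater_a = [1, 0, 1, 1, 0] * 5
rater_b = list(rater_a)
rater_b[3] = 0  # the two raters disagree on exactly one decision

agreement = percent_agreement(rater_a, rater_b)
print(f"{agreement:.1f}% agreement")
```

Raw percent agreement does not correct for agreement expected by chance; a chance-corrected index would be a different calculation from the one sketched here.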

Finally, two additional avenues for future research should be considered. Firstly, while the use of quantitative data was deemed appropriate for this study, future studies incorporating qualitative data may add further insights (e.g., qualitative research may be better able to establish whether the overall thrust of the five stages presentation in the website is endorsing, while only “lip service” is paid to criticisms). Furthermore, future research could play an important role in further validating the scoring system used in this study in the context of both digital and non-digital informational resources.

Rolf Degen summarizing... Many of the features that serve as cues for old age are also signs of masculinity, and female faces are perceived as more masculine as they become older

How facial aging affects perceived gender: Insights from maximum likelihood conjoint measurement. Daniel Fitousi. Journal of Vision November 2021, Vol.21, 12. https://doi.org/10.1167/jov.21.12.12

Abstract: Conjoint measurement was used to investigate the joint influence of facial gender and facial age on perceived gender (Experiment 1) and perceived age (Experiment 2). A set of 25 faces was created, covarying independently five levels of gender (from feminine to masculine) and five levels of age (from young to old). Two independent groups of observers were presented with all possible pairs of faces from this set and compared which member of the pair appeared as more masculine (Experiment 1) or older (Experiment 2). Three nested models of the contribution of gender and age to judgment (i.e., independent, additive, and saturated) were fit to the data using maximum likelihood. The results showed that both gender and age contributed to the perceived gender and age of the faces according to a saturated observer model. In judgments of gender (Experiment 1), female faces were perceived as more masculine as they became older. In judgments of age (Experiment 2), young faces (age 20 and 30) were perceived as older as they became more masculine. Taken together, the results entail that: (a) observers integrate facial gender and age information when judging either of the dimensions, and that (b) cues for femininity and cues for aging are negatively correlated. This correlation exerts stronger influence on female faces, and can explain the success of cosmetics in concealing signs of aging and exaggerating sexually dimorphic features.
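The three nested observer models named in the abstract can be sketched in a few lines. The sketch below assumes the standard signal-detection formulation of conjoint measurement, in which the observer compares internal scale values perturbed by Gaussian noise; the scale values, the noise parameter `sigma`, and all function names are hypothetical and are not taken from the paper. It evaluates choice probabilities and a log-likelihood for given parameters and does not perform the maximum likelihood fit itself.

```python
import math
from itertools import combinations


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def independent(psi_gender):
    # Perceived masculinity depends on the gender level only.
    return lambda g, a: psi_gender[g]


def additive(psi_gender, psi_age):
    # Each dimension contributes its own term; the terms are summed.
    return lambda g, a: psi_gender[g] + psi_age[a]


def saturated(psi_cell):
    # Every (gender, age) cell has its own scale value, allowing interaction.
    return lambda g, a: psi_cell[g][a]


def p_first_chosen(psi, face1, face2, sigma=1.0):
    """Probability that face1 is judged more masculine than face2."""
    delta = psi(*face1) - psi(*face2)
    return phi(delta / sigma)


def log_likelihood(psi, trials, sigma=1.0):
    """Log-likelihood of trials given as (face1, face2, chose_first)."""
    total = 0.0
    for face1, face2, chose_first in trials:
        p = p_first_chosen(psi, face1, face2, sigma)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p if chose_first else 1.0 - p)
    return total


# 5 x 5 design: faces indexed by (gender level, age level), both 0..4.
faces = [(g, a) for g in range(5) for a in range(5)]
pairs = list(combinations(faces, 2))  # all unordered pairs of distinct faces

# Hypothetical scale values, chosen only to illustrate the three models.
psi_gender = [0.0, 1.0, 2.0, 3.0, 4.0]
psi_age = [0.0, 0.25, 0.5, 0.75, 1.0]
# Illustrative interaction: the age contribution shrinks as faces get more masculine.
psi_cell = [[psi_gender[g] + psi_age[a] * (4 - g) / 4 for a in range(5)]
            for g in range(5)]

young_fem, old_fem = (0, 0), (0, 4)
for name, model in [("independent", independent(psi_gender)),
                    ("additive", additive(psi_gender, psi_age)),
                    ("saturated", saturated(psi_cell))]:
    p = p_first_chosen(model, old_fem, young_fem)
    print(f"{name}: P(old feminine face judged more masculine) = {p:.3f}")
```

Under the independent model the old and young feminine faces are indistinguishable in masculinity, so the choice probability is 0.5; under the other two models age shifts the judgment.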

General discussion
I find that facial gender and age are not perceived independently of each other. For 14 of 16 observers, judgments of gender (or age) were contaminated by age (or gender) according to a saturated observer model. Generally, the results suggest that female faces are perceived as more masculine as they become older; and young faces (age 20 and 30) are judged as older as they become more masculine. Why do aging and gender interact? The answer is rooted in the perceptual structure of the faces themselves. Perception of facial gender and age rely on shape and texture cues (Brown & Perrett, 1993; Bruce & Langton, 1994; Burton et al., 1993). The correlations between these phenotypic aspects can be readily demonstrated in our set of synthetic face stimuli (Figure 1), and they are likely present in real faces. Facial aging is conveyed by morphological cues (Berry & McArthur, 1986; Burt & Perrett, 1995; George & Hole, 2000; O’Toole et al., 1997) such as a) an increase in the size of the jaw, b) thinning of the lips, and c) an increase in the distance between the eyebrow and the eyes. Textural cues for aging particularly affect the skin: a) skin tone becomes darker, b) it has more wrinkles, c) its luminance contrast decreases, and d) its pigmentation becomes yellower. Many of these shape and texture cues also signal masculine features (Brown & Perrett, 1993; Bruce & Langton, 1994; Burton et al., 1993; Russell, 2009). Men have bigger jaws, their lips are thinner, and their eyebrows are closer to their eyes than females; moreover, they tend to have darker skin with lower contrast (Russell, 2010; Tarr et al., 2001). The upshot is that many of the features that serve as cues for old age are also signs of masculinity. The current study shows that cues for age and for gender have the strongest interactive influence when faces are either young or feminine.
A comment is in order regarding the relations between skin lightness and gender in the current experimental setting. Despite my great efforts to eliminate the correlation between skin lightness and gender, the feminine skin tones created by FaceGen were slightly lighter than the masculine ones. One may argue that this undermines the current conclusions because observers could have used skin lightness as a cue for gender. However, one should note that such a confound may reflect an ecologically valid cue because a) in real populations female skin reflectance is 2 to 3 percentage points above that of male skin (Rahrovan et al., 2018), and b) FaceGen relies on a representative sample of real people (Inversions, 2008). Moreover, a study by Macke and Wichmann (2010) also attempted to remove textural cues for gender (including lightness), but it seems that these authors could not prevent this built-in confound. In the caption to their Figure 1 they admit that: “For some men with strong beard growth, like the gentlemen in the rightmost column, this meant that there was a slightly darker region around the mouth – at least from an introspective point of view a reasonable cue to gender” (p. 6). The upshot is that it is difficult to equate experimentally the skin lightness of feminine and masculine faces due to natural differences. Future studies may be able to circumvent this confound, but then an issue may arise as to whether such faces reflect the statistical structure of real-world faces.
Evolution, cosmetics, and facial aging
From an evolutionary standpoint, the current findings make sense. Fertility in young females may be signaled by cues for femininity and cues for age. The correlation between the two types of cues leads to informational redundancy that increases the chances that information about fertility is transmitted efficiently and correctly to potential mates. This idea can also explain the success of cosmetics (Russell, 2010) and its higher prevalence among women (Etcoff et al., 2011; Russell, 2009). Sexual attractiveness and anti-aging are two main goals of the cosmetics industry, and the current study can explain why. Signals of femininity are positively correlated with attractiveness (O'Toole et al., 1998), and as we have shown here are also negatively correlated with age. This finding can explain the biological incentive for using cosmetics to highlight sexually dimorphic attributes of femininity, but also to conceal cues for old age. Both serve as signals of fertility and are expressed on the same facial cues. For example, Russell (2009) demonstrated the existence of a sex difference in facial contrast that affects the perception of gender. Females have greater luminance contrast between the eyes, lips and the surrounding skin than men. Russell (2009) showed that cosmetics consistently increase facial contrast and thus function to exaggerate feminine features and consequently their attractiveness. Notably, skin contrast also differs between young and old faces and serves as a cue for age (Berry & McArthur, 1986; Burt & Perrett, 1995; George & Hole, 2000; O’Toole et al., 1997). Lower levels of contrast signal old age. Thus, cosmetics not only exaggerate sexually dimorphic attributes, but also decrease perceived age. Etcoff et al. (2011) found that the influences of cosmetics go even farther than that, exerting dramatic positive effects on judgments of competence, likability, and trustworthiness.
Nonveridical perception of facial gender and age
The present study reveals that facial age and gender are not perceived veridically, but are subjected to major influences of context. Context here refers to the contamination of each dimension by the other. In this sense, each face has a specific gender (age) level that sets a unique context for the perception of its age (gender). This finding is in accordance with the mentioned effects of cosmetics on perceived gender (Etcoff et al., 2011; Russell, 2009), and also with several recent adaptation studies that found that the appearance of both age (O’Neil & Webster, 2011) and gender (Schweinberger et al., 2010) can be altered through adaptation to a previous face. For example, a neutral-gender face seems to be male after adaptation to a female face (Schweinberger et al., 2010). Similarly, adapting to an old face causes faces of intermediate age to appear younger (O’Neil & Webster, 2011). These context effects imply that the internal representations that govern facial age and gender are dynamic and are sensitive to previous experience and correlational structures in the faces themselves. I have recently proposed a ‘face file’ approach to face recognition (Fitousi, 2017a, 2017b), which assumes that faces are stored as temporary episodic representations with detailed featural information about the face’s gender, age, identity, and emotion. These features are bound to each other (e.g., male+young) and can be updated momentarily. Face files can be used to account for the context-dependent nature of facial age and gender (Fitousi, 2017a, 2017b).
Age and gender are essential for what social scientists call person ‘construal’ (Bodenhausen & Macrae, 1998; Fiske & Neuberg, 1990; Freeman & Ambady, 2011; Macrae & Bodenhausen, 2001), the process by which social agents construct coherent representations of themselves and others. These representations are used by observers to guide information processing and information generation towards others. According to the dynamic interactive model by Freeman and Ambady (2011) the initial presentation of a face launches simultaneous activation of several competing social categories (e.g., age, gender, race). Along the accrual of evidence, the pattern of activation gradually sharpens into a clear interpretation (young female), while other alternatives are inhibited. According to this framework, a confluence of perceptual (bottom-up) and cognitive–social (top-down) factors can generate various types of interactions among social facial dimensions such as facial age and gender. The dynamic–interactive model can account for a large body of research that has documented interactive patterns in face categorization including the current findings (Cloutier et al., 2014; Freeman et al., 2012; Johnson et al., 2012). One crucial goal made explicit by the dynamic-interactive model is the need to distinguish between lower (perceptual) and higher (stereotypes, attitudes, expectations) sources of bias in face categorization (Becker et al., 2007). The former are yielded by correlated phenotypic traits in the sensory cues themselves (skin texture and cues for age), whereas the latter are generated by learned associations or social expectations that can be located in the ‘head’ of the observer.
The integral/separable distinction and MLCM
The application of the MLCM approach (Knoblauch et al., 2014) to psychological dimensions raises a caveat concerning a more general issue in psychology: the concept of perceptual independence. Garner proposed a fundamental distinction between integral and separable dimensions (Garner, 1962, 1970, 1974a, 1974b, 1976, 1991). This distinction is a pillar of modern cognitive science (for review see Algom & Fitousi, 2016). Objects made of integral dimensions, such as hue and saturation, are perceived in their totality and cannot be readily decomposed into their constituent dimensions. Objects made of separable dimensions, such as shape and color, can be readily decomposed into their constituent dimensions. The integral–separable distinction cannot be decided based on the verdict of only one procedure. There is the risk that a theoretical concept (e.g., separability) would be only a restatement of the empirical result (Fitousi, 2015; Von Der Heide et al., 2018).
To avoid circular reasoning, Garner has noted the need for converging operations (Garner et al., 1956). Several methodologies have been used to support the integral–separable distinction: a) Garner’s speeded classification task (Garner, 1974b), b) similarity scaling (Attneave, 1950; Melara, 1992), c) information theory (Fitousi, 2013; Garner, 1962; Garner & Morton, 1969), d) general recognition theory (GRT; Ashby & Townsend, 1986; Fitousi, 2013; Townsend et al., 2012; Maddox & Ashby, 1996), and e) system factorial technology (SFT; Townsend & Nozawa, 1995). Take method b) for example, in which observers are asked to rate the similarity of two objects (Hyman & Well, 1967). It has been often found that for integral objects similarity is computed according to a Euclidean distance metric, and for separable objects similarity is computed according to a city-block distance metric (Melara, 1992). It has also been shown that the outcome from the similarity procedure accords well with the Garner task results (Algom & Fitousi, 2016).
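The two distance metrics mentioned for similarity scaling differ only in the exponent of the Minkowski distance. The sketch below is illustrative: the function names and the example coordinates are hypothetical, and it shows only how the two metrics combine differences along two dimensions, not how similarity ratings are scaled in the cited studies.

```python
def minkowski(x, y, r):
    """Minkowski distance between two points; r=1 is city-block, r=2 is Euclidean."""
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)


def city_block(x, y):
    # Differences on each dimension are simply added (separable dimensions).
    return minkowski(x, y, 1)


def euclidean(x, y):
    # Differences are combined as a straight-line distance (integral dimensions).
    return minkowski(x, y, 2)


# Two hypothetical stimuli differing by 3 units on one dimension and 4 on the other.
stim_a = (0.0, 0.0)
stim_b = (3.0, 4.0)

print("city-block:", city_block(stim_a, stim_b))
print("Euclidean:", euclidean(stim_a, stim_b))
```

When two stimuli differ on both dimensions, the city-block distance is the larger of the two, which is why the pattern of judged dissimilarities can help discriminate between the two metrics.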
Recently, Rogers et al. (2016) and Rogers et al. (2018) have proposed that the MLCM can be used as a converging operation on the notion of integrality–separability. A case in point in their studies is the color dimensions of chroma and lightness (Munsell, 1912). In the Garnerian tradition, these dimensions are considered as classic integral dimensions: a) they produce Garner interference (Garner & Felfoldy, 1970) and b) they obey a Euclidean distance metric in similarity scaling (Burns & Shepp, 1988). If indeed the dimensions are dependent in processing, then an additive or saturated observer MLCM model should best describe the data. Rogers et al. (2016) found that the additive observer model best described the data. Lightness negatively contributed to perception of chroma for red, blue, and green hues (but not for yellow). These results are important because they demonstrate the utility of the MLCM in providing converging evidence on the notion of integrality–separability, and in identifying the internal representations that govern color dimensions. They are also highly informative in uncovering the specific pattern of dimensional interaction. One would have expected integral dimensions to be best fitted by a saturated observer model rather than an additive observer model. Hence, the application of multiple related methodologies to investigate questions of perceptual independence is of great practical and theoretical importance in sharpening and explicating our concepts.
The Garnerian edifice is rich in theoretical insights that can illuminate issues in MLCM, and vice versa. This can lead to a cross-fertilization of both methods. For example, an important caveat raised in the Garnerian tradition concerns the direction of interaction between a pair of dimensions. Integrality is not a symmetric concept. Dimension A can be integral with dimension B, while dimension B can be separable from dimension A (Fitousi & Algom, 2020). This notion can be readily applied to studies in MLCM. When judging dimension A and ignoring dimension B, observers can exhibit a complete independent observer model. However, when judging dimension B and ignoring dimension A, observers can exhibit an additive or saturated observer model. Moreover, Garnerian theorizing highlights the role of relative discriminability in determining the direction of asymmetry (Melara & Mounts, 1993). Often the more discriminable dimension intrudes on the less-discriminable dimension (Fitousi & Algom, 2006). It has been shown that relative discriminability can be altered by the researcher and determine the direction of interaction. Therefore, to provide a fair test of independence the dimensions should be equally discriminable (Algom et al., 1996). These factors might also be important in MLCM modeling. 
Future work should test in detail the exact relations between notions of integrality–separability in the Garner tradition and the notions of MLCM. It is not immediately clear, for example, that independence in the two approaches is the same. When the dimensions of facial age and gender were subjected to the Garner test (Garner, 1974b) by Fitousi (2020), they were found to be separable dimensions. But the application of the MLCM to the same dimensions supported their dependency. Why can age and gender appear as separable dimensions in the Garner paradigm and as integral or interactive dimensions (Algom et al., 2017; Algom & Fitousi, 2016) in the MLCM? The solution to this caveat comes by assuming that perceptual independence is not a unitary concept, but rather a nomenclature pointing to various types of independence (Ashby & Townsend, 1986; Fitousi, 2013, 2015; Fitousi & Wenger, 2013). This idea was originally developed by Garner and Morton (1969) and Ashby and Townsend (1986). It seems that conjoint measurement gauges different types of independence than the Garner paradigm. Future studies may be able to understand the relations between these two approaches.