Tuesday, December 17, 2019

Religiosity modestly increased 2006-2011 & modestly decreased 2011-2017; people's psychological & physical health in religious countries is particularly good once economic development is accounted for

Joshanloo, M., & Gebauer, J. E. (2019). Religiosity’s nomological network and temporal change: Introducing an extensive country-level religiosity index based on Gallup World Poll data. European Psychologist, Dec 2019. https://doi.org/10.1027/1016-9040/a000382

Abstract: Countries differ in their religiosity and these differences have been found to moderate numerous psychological effects. The burgeoning research in this area creates a demand for a country-level religiosity index that is comparable across a large number of countries. Here, we offer such an index, which covers 166 countries and rests on representative data from 1,619,300 participants of the Gallup World Poll. Moreover, we validate the novel index, use it to examine temporal change in worldwide religiosity over the last decade, and present a comprehensive analysis of country-level religiosity’s nomological network. The main results are as follows. First, the index was found to be a valid index of global religiosity. Second, country-level religiosity modestly increased between 2006 and 2011 and modestly decreased between 2011 and 2017—demonstrating a curvilinear pattern. Finally, nomological network analysis revealed three things: it buttressed past evidence that religious countries are economically less developed; it clarified inconsistencies in the literature on the health status of inhabitants from religious countries, suggesting that their psychological and physical health tends to be particularly good once economic development is accounted for; and finally, it shed initial light on the associations between country-level religiosity and various psychological dimensions of culture (i.e., Hofstede’s cultural dimensions and country-level Big Five traits). These associations revealed that religious countries are primarily characterized by high levels of communion (i.e., collectivism and agreeableness). We are optimistic that the newly presented country-level religiosity index can satisfy the fast-growing demand for an accurate and comprehensive global religiosity index.

Concluding Remarks

Countries differ widely in their degree of religiosity, and those country-level differences qualify numerous psychological effects that have been deemed universal (e.g., psychological health benefits of religiosity – Gebauer, Sedikides, et al., 2012; psychological health benefits of income – Gebauer, Nehrlich, et al., 2013; life satisfaction benefits derived from affective experience – Joshanloo, 2019). Hence, there is a great demand for a valid index of country-level religiosity that is available for a large number of countries. The present study provided such an index. The index is based on representative data from 1,619,300 individuals and spans 166 countries worldwide. We made use of that index to examine temporal change in worldwide religiosity over the last decade and to gain a more complete understanding of country-level religiosity’s nomological network. We found strong evidence that our country-level religiosity index is a valid and robust index of global religiosity. More precisely, its correlation with an external four-item index of global religiosity was near-perfect. Moreover, we randomly divided the full GWP sample into two independent subsamples and used those subsamples to compute two additional country-level religiosity indices. Those two entirely independent country-level religiosity indicators, too, correlated near-perfectly with the external country-level index of global religiosity. That finding speaks for the robustness of our index. The two independent country-level religiosity indices also demonstrated a near-perfect correlation, a finding that attests to the validity and reliability of our index.

In addition, we estimated the worldwide temporal change in religiosity between 2006 and 2017. We found a quadratic trajectory (Figure 2). For the year 2006, we estimated that 73.039% of the world population considered religiosity an important part of their daily lives. From 2006 to 2011, we found an increase in worldwide religiosity levels. By 2011, we estimated the highest level of religiosity worldwide – 74.044% of the world population considered religiosity an important part of their daily lives. Finally, from 2011 to 2017, we found a decrease in worldwide religiosity. By 2017, the worldwide level of religiosity was (descriptively) lower than in 2006 – 72.725% of the world population considered religiosity an important part of their daily lives. Additional analyses identified a subset of 101 countries that drove the just-described quadratic trajectory (Figure 3).

Finally, we estimated the worldwide temporal changes in Christianity and Islam between 2008 and 2017. We found a small linear decline in Christianity from 41.753% in 2008 to 40.582% in 2017 and no significant change in Islam within the same period of time (25.484% in 2008 and 25.508% in 2017). The small temporal change over the last decade buttressed our decision to base our country-level religiosity index on the cumulative data from 2005 to 2017. A detailed analysis of why we observed the just-described pattern of temporal change is beyond the scope of the present work, but it certainly is an interesting and timely topic for future research. We acknowledge that the temporal patterns we discovered in the present analyses may be subject to unexpected changes in the near future, and at present, it does not seem possible to predict the future of worldwide religiosity.
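The quadratic trajectory above runs through three published worldwide estimates (73.039% in 2006, 74.044% in 2011, 72.725% in 2017). As a quick illustration of what such an inverted-U implies, the sketch below fits the unique parabola through those three points and locates its vertex; the coefficients and the derived peak year are our own back-of-envelope arithmetic, not values reported in the paper.

```python
# Illustrative sketch only: the three (year, %) pairs are from the paper,
# but the fitted coefficients and peak year are our own arithmetic.

def quadratic_through(points):
    """Return (a, b, c) of y = a*x^2 + b*x + c through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = points
    # Lagrange interpolation, expanded to standard-form coefficients.
    d1 = (x1 - x2) * (x1 - x3)
    d2 = (x2 - x1) * (x2 - x3)
    d3 = (x3 - x1) * (x3 - x2)
    a = y1 / d1 + y2 / d2 + y3 / d3
    b = (-y1 * (x2 + x3) / d1 - y2 * (x1 + x3) / d2
         - y3 * (x1 + x2) / d3)
    c = (y1 * x2 * x3 / d1 + y2 * x1 * x3 / d2
         + y3 * x1 * x2 / d3)
    return a, b, c

# Reported % of world population calling religion important in daily life.
estimates = [(2006, 73.039), (2011, 74.044), (2017, 72.725)]
a, b, c = quadratic_through(estimates)
peak_year = -b / (2 * a)  # vertex of the parabola

print(f"a = {a:.5f} (negative => inverted-U shape)")
print(f"estimated peak year: {peak_year:.1f}")
```

The negative leading coefficient confirms the inverted-U, and the vertex lands very close to 2011, consistent with the paper's description of 2011 as the worldwide peak.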

Moreover, the present research includes the most complete analysis of country-level religiosity’s nomological network ever conducted, involving 36 external variables. First, in replication of much previous research, country-level religiosity was associated with lower economic development. It is noteworthy that previous research typically capitalized on a single indicator of economic development, whereas we used several complementary indicators and found highly convergent results. Second, we examined the association between country-level religiosity and a large array of country-level health indicators (psychological health, physical health, and health security). The results were less consistent than in the case of economic development. However, on the whole it seems fair to conclude that country-level religiosity was mostly negatively related to health, but that negative relation was driven almost entirely by the poor economic conditions in most religious countries. In fact, after accounting for country-level differences in economic conditions, religious countries were by and large healthier than non-religious countries. Mediation analyses were largely consistent with Oishi and Diener’s (2014) proposal that the health benefits of country-level religiosity are partly due to higher levels of purpose in life in religious countries. Finally, little has been known about the associations between country-level religiosity and psychological dimensions of culture (i.e., Hofstede’s cultural dimensions and country-level Big Five traits). We found that religious countries primarily differed from their non-religious counterparts on dimensions that belong to the fundamental communion dimension (i.e., collectivism and agreeableness; Abele & Wojciszke, 2014). This finding squares with the high importance that all world religions place on communal values and norms (Gebauer, Paulhus, et al., 2013; Gebauer, Sedikides, & Schrade, 2017). 
Overall, the nomological network analysis conducted here buttressed previous research, increased the confidence in our country-level religiosity index, and expanded our understanding of country-level religiosity.

In conclusion, the present research introduced and validated the most extensive country-level religiosity index to date, examined its worldwide temporal trajectory over the last decade, and clarified its nomological network. We (optimistically) hope that our country-level religiosity index will be helpful for the large and fast-growing community of scholars interested in the powerful role of country-level religiosity for human thought, feelings, and behavior.

Voluntary castration: They felt penile removal made them more physically attractive; & at least two individuals thought that a penectomy would make them a better submissive sexual partner

Characteristics of Males Who Obtained a Voluntary Penectomy. Erik Wibowo, Samantha T. S. Wong, Richard J. Wassersug & Thomas W. Johnson. Archives of Sexual Behavior, Dec 16 2019. https://link.springer.com/article/10.1007/s10508-019-01607-8

Abstract: We report here on survey data from 11 genetic males, who had voluntary penectomies without any explicit medical need, yet did not desire testicular ablation. This group was compared to a control group of men who completed the same survey but had no genital ablation. The penectomy group was less likely to identify as male than the control group. They were also more likely to have attempted self-injury to their penis (at a median age of 41.5 years), been attracted to males without penises, and felt that they were more physically attractive without a penis than the controls. Motivations for voluntary penectomy were aesthetics (i.e., a feeling that the penile removal made them more physically attractive) or eroticism (i.e., at least two individuals thought that a penectomy would make them a better submissive sexual partner). In terms of sexual function, the penectomized and control groups reported comparable sexual function, with six penectomized individuals claiming to still be able to get and keep an erection, suggesting possible incomplete penile ablation. In their childhood, penectomized individuals were more likely than the controls to have pretended to be castrated and to have involved the absence of genitals of their toys in their childhood play. We discuss characteristics and sexual outcomes for individuals who have had a voluntary penectomy. A future study with a larger sample size on men who desire penectomies is warranted.


In this study, we compared various self-reported data from
genetic males, who elected voluntary penectomy, with data
from men with intact genitals, who did not express any
desire for genital removal. We found some differences
between the two groups, in that penectomized men were
more likely (1) to identify as non-male for their gender, (2)
to have attempted self-genital injury, (3) to be attracted to
males without penises, (4) to feel attractive without a penis,
(5) to have pretended to be genital-less in their childhood
and (6) to have involved the absence of genitals of their toys
in their childhood play. Other psychosexual outcomes that
can be affected by hormonal levels such as sexual function,
depression and anxiety were comparable between the two groups.

Psychiatric Condition
From this study, a majority (8 out of 11) of the penectomized individuals
felt attractive without a penis and this feeling may have
contributed to their desire for a penectomy. In addition, at least
two individuals thought that the absence of a penis allowed them
to be a better submissive sexual partner, i.e., their penectomy
desire was sexually motivated. However, we cannot conclude
whether the penectomized individuals in our study have a
psychiatric condition without a psychiatric evaluation. We asked if
they had been diagnosed with any medical conditions, and only two
answered, one with inflammatory bowel disease and the other
with anxiety/depression. Neither alluded to a major psychiatric
disorder. Collectively our data suggest that there are two motivations
for desiring a penectomy in our sample. Some individuals
feel that their penis is not part of their body and they may have
body integrity dysphoria. Others have a paraphilic motivation,
where they eroticize not having a penis as making them a better
submissive sexual partner.
Early Life Experience
One unexpected finding from this study was that most of
the penectomized individuals grew up in a medium-large
city, and had neither observed animal castration nor been threatened
with castration in their childhood. This is in contrast
to data on individuals with an orchiectomy, many of whom
were raised on a farm, had reported participating in animal
castration and were threatened with castration during their
childhood (Vale et al., 2013). However, the Vale et al. study
included predominantly men who had just an orchiectomy,
with only 7.5% of the men penectomized. We compared data
from the penectomized men in the Vale et al. study and the
current study, but no difference was found in terms of living
conditions during childhood. Being raised in a populous
location, rather than on a farm, may explain why
the penectomized individuals had never witnessed animal
castration. These findings suggest that the etiology for voluntary
penectomy and voluntary castration are likely to be
different. It remains unknown to what extent social setting
and population density influenced interest in penectomy for
these individuals.
Among those who had played with male toy figurines
that lacked external genitalia, the penectomized individuals
were more likely to notice and incorporate the absence of
genitals in their play than non-penectomized individuals.
Although our sample size is small, this finding suggests that for
some individuals an extreme desire for a penectomy may
be linked to their childhood exposure to such anatomically
inaccurate male action figures. Half of the penectomized
individuals who had played with such toys (as compared to
21% of the controls) acknowledged eroticizing their play.
It is unclear whether interest in genital ablation for these
individuals preceded play with the toys or whether interest
in such toys came from an existing displeasure with their
external genitalia. Previous studies on girls suggest that
exposure to Barbie dolls may influence their perception
of what is an ideal body (Dittmar, Halliwell, & Ive 2006;
Rice, Prichard, Tiggemann, & Slater, 2016). Thus, there is
a possibility that, in a similar fashion, exposure to genital-less
male toys may partially contribute to the penectomy
desire by some individuals in our study sample.
In addition, we found that the penectomized individuals
were more likely to have pretended to lack male external
genitalia than the controls. Interestingly, those in
both groups who had pretended to lack male genitalia
reported starting to do so at about
5 years of age and after playing with genital-less male toy
figurines. To what extent this earlier experience contributed
to the desire for a penectomy or its early emergence remains
unclear. However, years (for most, decades) later, the penectomized
individuals were more likely to have attempted self-injury
to their penis than the controls. While we did not ask
when the desire for a penectomy first arose, possibly they
delayed getting a penile ablation because they were aware of
the medical risks associated with the procedure.
There are further differences between our results and what
Vale et al. (2013) reported on childhood abuse in their study
of the Eunuch Archive community. We did not find any difference
between the penectomized men and the controls, but
Vale et al. found a higher likelihood of experiencing sexual
abuse among individuals who had, or were considering having,
genital ablations than in their control group. Again, that
study included mostly castrated individuals, and the questions
for assessing childhood trauma differed between the two
studies. However, the divergent results again raise the possibility
of a different etiology for genital ablation in the two populations,
with penectomized individuals less likely to have experienced
childhood abuse than the castrated individuals. Additional
data on this topic have been collected by our team, and further
analyses are warranted.

Sexual Parameters
In this study, the religiosity levels were comparable between
the penectomized and control groups. Previously, one study
showed that individuals who had voluntary genital ablations
were more likely to attend religious services and to have been
raised in a religious household than men with intact genitalia
(Vale et al., 2013). However, again, that study included participants
who had received genital removal, with less than
10% being penectomized. Thus, their data may be skewed
toward those who were solely castrated.
We found that self-reported sexual function for the two
groups was similar. This is not surprising as penectomized individuals
were not orchiectomized and, thus, are not deprived of
natural gonadal hormones. What is intriguing is that six out
of 11 penectomized individuals reported that they still could
easily “get and keep” an erection despite having been penectomized.
Unfortunately, we do not know the extent of the penile
ablation in our sample; we do not know, for example, whether the
individuals were fully or only partially penectomized. Those
with partial penectomy may still be able to have some penetrative
sex by using the remaining penile stump. Even with a
full penectomy, the roots of the penis remain embedded in the
perineum, and such individuals may still sense some erection
in the penile root.
Our data on erection contrast with what has been reported
for the majority of penectomized penile cancer patients, who
report impaired erectile function (Maddineni, Lau, & Sangar,
2009; Sosnowski et al., 2016). However, most penile cancer
occurs in men over the age of 55
(www.cancer.org/cancer/penile-cancer/causes-risks-prevention/risk-factors.html),
older than the average age of the penectomized individuals in our
study. This comparison suggests that the reasons for having
a penectomy may contribute to one’s perceived erection after
the procedure—penile cancer patients undergo the treatment to
survive, whereas the individuals in our study purposely sought
penectomy when it was not explicitly medically necessary.
What we cannot deduce is how much of these perceived
erections are phantom erections, where one feels an erection
in the absence of a penis (Wade & Finger, 2010). Previous
studies have indicated that some male-to-female transsexuals
(Ramachandran & McGeoch, 2008) and men penectomized for
other reasons, such as penile cancer (Fisher, 1999; Crone-Münzebrock,
1951, as cited in Lawrence, 2010), experience phantom
erections after penectomy.

Our study has several limitations. First, our sample of individuals
who had been penectomized but retained their testicles
was small: just 11 individuals out of the 1,023 who responded
to our survey. This affirms, however, how uncommon it is for men
to seek a penectomy in the absence of other genital modifications.
It is important to note that our survey was posted on Eunuch.org,
a website for people who are primarily interested in “castration”
and not necessarily “penectomy.” Respondents to our survey
were men with interest in all castration-related topics, but only
approximately 10% of the respondents claimed to have had genital
ablations. We know of no dedicated website for people who are
interested in penectomy alone.
Secondly, as the data were self-reported and anonymously
obtained, we cannot confirm the veracity of the data on genital
ablations. Participants were asked broadly about their genital
ablation status, but not interrogated about the exact extent of
penile tissue removal. Information on childhood experiences
is susceptible to recall bias. Lastly, regarding the controls, that
group was composed of men who visited the Eunuch Archive
website and thus had an interest in genital removal, an interest that may
be rarer in the general population.

We overestimate our performance when anticipating that we will try to persuade others & bias information search to get more positive feedback if given the opportunity; this confidence increase has a positive effect on our persuasiveness

Strategically delusional. Alice Soldà, Changxia Ke, Lionel Page & William von Hippel. Experimental Economics, Dec 16 2019. https://link.springer.com/article/10.1007/s10683-019-09636-9

Abstract: We aim to test the hypothesis that overconfidence arises as a strategy to influence others in social interactions. To address this question, we design an experiment in which participants are incentivized either to form accurate beliefs about their performance at a test, or to convince a group of other participants that they performed well. We also vary participants’ ability to gather information about their performance. Our results show that participants are more likely to (1) overestimate their performance when they anticipate that they will try to persuade others and (2) bias their information search in a manner conducive to receiving more positive feedback, when given the chance to do so. In addition, we also find suggestive evidence that this increase in confidence has a positive effect on participants’ persuasiveness.

Keywords: Overconfidence · Motivated cognition · Self-deception · Persuasion · Information sampling · Experiment

4 General discussion and conclusion

In the current research we tested the hypothesis that overconfidence emerges as a
strategy to gain an advantage in social interactions. In service of this goal, we conducted
two studies in which we manipulate participants’ anticipation of strategic
interactions and also the type of feedback they receive.
    In our design, participants undertake both a Persuasion Task and an Accuracy
Task in all treatments. By switching the order of these tasks, we can manipulate
participants’ goals (being accurate vs. persuasive). Because they were not aware of
the nature of the second task when undertaking the first task, we prevent participants
from engaging in a cost-benefit analysis between the two goals. However, we
acknowledge that this choice of design has its own limitations.
First, self-deception might be possible between the Accuracy and Persuasion
Tasks in the Accuracy-first treatments. Because we did not elicit beliefs again after
the Persuasion Task in the Accuracy-first treatments, we cannot rule out this possibility
directly. However, there is empirical evidence showing that the way people
interpret information tends to be sticky. For example, Chambers and Reisberg
(1985) presented participants with the famous duck/rabbit figure, which could be
interpreted as either of these animals. They found that once participants arrived at
an initial interpretation that it was a duck, they were unable to re-interpret it as a
rabbit without seeing it again. In the same manner, our hypothesis was established
on the expectation that once an (accurate) belief is formed, it is “on record”. It can
therefore not be consciously ignored by participants (even if they have incentives
to form overconfident beliefs in the next task). Hence, without additional data, participants
would not be able to re-construe their beliefs easily in our Accuracy-first
treatment after the Accuracy Task. In contrast, if a participant does not have a prior
accurate belief “on record”, it may be easier to interpret information in a self-serving
manner. Similarly, we conjecture that once an inflated belief has been formed in
the Persuasion-first treatment through motivated reasoning, it is also hard to “debias”
it, even though the subsequent Accuracy Task required them to form the most
accurate beliefs. There is no obvious reason to believe that participants were able to
easily inflate beliefs (after forming well-calibrated beliefs in Accuracy Task) later in
the Persuasion Task, but unable to easily deflate the overconfident beliefs (formed
in the Persuasion Task) in the subsequent Accuracy Task. Our experimental results
can be seen as justifying our assumptions ex post, because we would not have found
any treatment difference in belief elicitations if participants were able to adjust their
beliefs flexibly depending on the incentives they were given in each task.
Second, the process of writing an essay in the Persuasion Task could lead participants
to form inflated self-assessments of their performance, even in the absence of
any self-deception motives. While there is evidence showing that self-introspection
may lead to overconfident self-assessment (Wilson and LaFleur 1995), Sedikides
et al. (2007) find that written self-reflection actually decreases self-enhancement
biases and increases accuracy. If the writing task made it harder for the participants
to form inflated beliefs, the treatment effect identified in the Self-Chosen Information
condition might be underestimated. On the contrary, if the writing task helped
them form inflated beliefs, the effect size measured in the Self-Chosen Information
condition might be overestimated. However, if the Persuasion Task itself inflated
self-beliefs, we should have observed a significant treatment (Persuasion-first vs.
Accuracy-first) difference in overconfidence in the No Information condition. The
fact that it is not the case can be seen as tentative evidence that even if the Persuasion
Task itself could inflate self-beliefs, this effect is unlikely to be big enough
to undermine the main effect we have identified in the Self-Chosen Information
condition.
    Our findings from both studies support the idea that self-beliefs respond to variations
in the incentives for overconfidence. In our experiments, participants were put
in situations where they could receive higher payoffs from persuading other players
that they performed well in a knowledge test. We observe that their confidence
in their performance increased in such situations. Consistent with the interpretation
that overconfidence is induced by strategic motivated reasoning, we observe that
when given the freedom to choose their feedback, participants who were motivated
to persuade chose to receive more positive information. This choice, in turn, helped
them form more confident beliefs about their performance. Participants holding
higher beliefs tend to be more successful at persuading the reviewers that they did
well through a written essay, particularly in the laboratory study.
These results support the hypothesis that people tend to be more overconfident
when they expect that confidence might lead to interpersonal gains, which helps to
explain why overconfidence is so prevalent despite the obvious costs of having miscalibrated
beliefs. Future research should investigate whether the type of interpersonal
advantage observed in the context of this experiment can also be observed in
different strategic contexts (e.g. negotiation, competition).

USA: Evidence of a negative Flynn effect on the attention/working memory and learning trials; as expected, education level, age group, and ethnicity were significant predictors of California Verbal Learning Test performance

Cohort differences on the CVLT-II and CVLT3: evidence of a negative Flynn effect on the attention/working memory and learning trials. Lisa V. Graves, Lisa Drozdick, Troy Courville, Thomas J. Farrer, Paul E. Gilbert & Dean C. Delis. The Clinical Neuropsychologist, Dec 12 2019. https://doi.org/10.1080/13854046.2019.1699605

Objective: Although cohort effects on IQ measures have been investigated extensively, studies exploring cohort differences on verbal memory tests, and the extent to which they are influenced by socioenvironmental changes across decades (e.g. educational attainment; ethnic makeup), have been limited.

Method: We examined differences in performance between the normative samples of the CVLT-II from 1999 and the CVLT3 from 2016 to 2017 on the immediate- and delayed-recall trials, and we explored the degree to which verbal learning and memory skills might be influenced by the cohort year in which norms were collected versus demographic factors (e.g. education level).

Results: Multivariate analysis of variance tests and follow-up univariate tests yielded evidence for a negative cohort effect (also referred to as negative Flynn effect) on performance, controlling for demographic factors (p = .001). In particular, findings revealed evidence of a negative Flynn effect on the attention/working memory and learning trials (Trial 1, Trial 2, Trial 3, Trials 1–5 Total, List B; ps < .007), with no significant cohort differences found on the delayed-recall trials. As expected, education level, age group, and ethnicity were significant predictors of CVLT performance (ps < .01). Importantly, however, there were no interactions between cohort year of norms collection and education level, age group, or ethnicity on performance.

Conclusions: The clinical implications of the present findings for using word list learning and memory tests like the CVLT, and the potential role of socioenvironmental factors on the observed negative Flynn effect on the attention/working memory and learning trials, are discussed.

Keywords: Cohort differences, Flynn effect, verbal memory, California Verbal Learning Test

In the present study, we examined differences in performance on the immediate- and
delayed-recall trials between the CVLT-II and CVLT3 normative samples. Specifically,
we explored the extent to which verbal learning and memory skills were influenced
by the cohort year in which norms were collected (i.e. 1999 for the CVLT-II versus
2016–2017 for the CVLT3) versus differences in education level. Of note, differences in
education level between the CVLT-II and CVLT3 normative samples mirrored an
increase in the proportion of U.S. adults who completed post-secondary education
during the time period spanning the development of the CVLT-II and CVLT3.
The present study revealed evidence of a negative Flynn effect on the attention/
working memory and learning trials of the CVLT-II/CVLT3, with the CVLT3 cohort performing
significantly worse than the CVLT-II cohort on Trial 1, Trial 2, Trial 3, Trials 1–5
Total, and List B. In contrast, no significant cohort differences were found on the
delayed-recall trials. Consistent with past research, education level, age group, and
ethnicity were shown to be significant predictors of overall CVLT performance.
Education level and age group were positively and negatively associated with CVLT-II/
CVLT3 performance, respectively. With regard to ethnicity, performance on multiple
immediate- and delayed-recall trials was significantly higher among White and
Hispanic individuals relative to African-American individuals. Nevertheless, none of these
demographic variables were shown to have an interactive effect with cohort year of
norms collection on performance.
The present study overcomes some of the limitations of previous studies that examined
Flynn or cohort effects on learning and memory of word lists (e.g. use of relatively
small sample sizes; limited age ranges; confounding time of testing with changes in the
target words; using data harmonization techniques to convert Logical Memory scores to
CERAD Word List scores). Of note, the present study offers the advantage of using the
same word lists administered to large normative samples that represent a wide age
range and that were matched to the demographic makeup of the U.S. census at the time
that the testing occurred in order to explore potential cohort effects on a standardized
measure of verbal learning and memory. Further, the present findings are in line with
recent research suggesting that a negative Flynn effect may be occurring not only on IQ
tests, but also on measures of auditory attention/working memory and learning of word
lists. That is, given that negative cohort effects were observed only on immediate-recall
trials (and appeared to be driven by cohort differences on the first three learning trials in
particular), the present findings provide further evidence that the attention/working
memory aspects of verbal memory may be particularly vulnerable to negative cohort
effects (Wongupparaj et al., 2017).
As discussed above, the present study indicated that the CVLT3 normative sample
was more highly educated, on average, than the CVLT-II normative sample, and this
difference mirrored the increase over the past two decades in the proportion of U.S.
adults who completed post-secondary education. However, while the present study
yielded evidence for a significant positive (albeit relatively small) association between
education level and CVLT-II/CVLT3 performance, the evidence for a negative cohort
effect on performance persisted even after accounting for differences in education
level and cohort year × education interactions (which were nonsignificant).
Furthermore, the observed negative cohort effect on immediate recall was present
across all age and ethnic groups, with cohort year × age group and cohort year × ethnicity
interactions also being nonsignificant.
The present results are consistent with the findings from a meta-analysis of cohort
effects on attention and working memory measures conducted by Wongupparaj et al.
(2017), who found a gradual decline in more complex auditory attention/working
memory skills (e.g. digit span backward) over the past four decades. Although the current
findings differ from those of Dodge et al. (2017), in which a positive cohort effect
was reported for word list learning and memory performance (including immediate
and delayed recall), there were a number of limitations in that study that make it difficult
to directly compare results from the two investigations (e.g. investigating only
older adults [65 years or older]; notable differences in the proportions of adults in
each age cohort within the pooled sample who were tested in the first versus second
data collection period; not addressing the fact that age cohort and period of testing
were potentially confounded; and using data harmonization techniques to convert
Logical Memory scores from a subset of its data pool into CERAD Word List scores to
facilitate analyses of cohort differences on verbal memory performance).
The results of the present study raise intriguing questions about the effects of socioenvironmental
changes that have unfolded during the time period spanning the
development of the second and third editions of the CVLT. In particular, the present
findings suggest that socioenvironmental changes may have occurred since 2000 that
(a) might be negatively impacting working memory and verbal learning skills, (b) are
not disproportionately affecting certain age or ethnic groups, and (c) are occurring
independent of generational changes in educational attainment. While education level
was examined in the present study, a number of researchers have highlighted distinctions
between educational attainment and education quality, and have suggested that
“educational attainment” as a homogeneous variable may have become diluted in
recent years due to varying standards and quality required for degrees across educational
settings (Allen & Seaman, 2013; Bratsberg & Rogeberg, 2018; Hamad et al., 2019;
Jaggars & Bailey, 2010; Nguyen et al., 2016; Rindermann et al., 2017). The lack of a
cohort year × education level interaction in the present study may partly reflect
these recent concerns that educational attainment is no longer a homogeneous
variable. This consideration is important for the present findings given that the negative
Flynn effect that has recently been found on IQ measures has been partly attributed
to reduced quality of education in those studies (Allen & Seaman, 2013; Jaggars
& Bailey, 2010).
It is difficult to escape the observation that the time period spanning the development
of the second and third editions of the CVLT (1999/2000 versus 2016/2017) also
coincided with a profound societal change: the digital revolution. As noted in recent
reviews (Rindermann et al., 2017; Wilmer, Sherman, & Chein, 2017), the use of digital
technology, while offering multiple advantages, may have subtle but significant
adverse effects on working memory and rote memorization skills. While relationships
between the use of digital technology and verbal learning and memory performance
were not formally investigated in the present study, the current findings invite the
intriguing hypothesis that increased use of digital tools may inadvertently have an
adverse effect on working memory and learning abilities. Unfortunately, there has
been a paucity of studies investigating associations of self-reported and performance-based
internet use with cognition. While there is evidence that the ability to perform
different tasks on the internet is significantly correlated with performance on cognitive
tests (Woods et al., 2019), no studies have directly investigated whether varying
degrees of internet, mobile phone, or other digital technology usage may positively or
negatively affect the development and maintenance of different domains of cognition.
Future research should explore potential differences between high and low internet
users on neuropsychological test performance. In addition, the present study was
limited in that we were unable to assess relationships between CVLT performance and
other socioenvironmental changes that may have occurred in the years spanning the
development of the CVLT-II and CVLT3 (e.g. generational changes in healthcare or standard of living;
see Dutton & Lynn, 2013; Rindermann et al., 2017).
The present findings likely reflect true cohort differences in verbal learning
and memory skills, rather than differences in the makeup of the CVLT-II versus
the CVLT3, given that (a) the lists of target words are identical across the two versions
of the test; (b) the negative cohort effect was only observed on select trials, thereby
indicating that one version is not uniformly harder or easier than the other; and (c)
other recent studies have also found evidence for a negative Flynn effect on attention/
working memory components of verbal memory (Wongupparaj et al., 2017). One
question that arises is whether the observed negative cohort effect found on the CVLT
in the present study was due to a negative Flynn effect specifically on attention/working
memory and learning skills versus a broader effect on IQ in general, which has
also been reported in recent years (Bratsberg & Rogeberg, 2018; Dutton & Lynn, 2013,
2015; Dutton et al., 2016; Flynn & Shayer, 2018; Pietschnig & Gittler, 2015; Shayer &
Ginsburg, 2007, 2009; Sundet et al., 2004; Teasdale & Owen, 2005, 2008; Woodley &
Meisenberg, 2013). Given that the CVLT-II and CVLT3 were not co-normed with IQ
tests, we cannot directly investigate this relationship. However, IQ has been shown to
correlate robustly with education level, and education was not shown to drive or moderate
any of the observed cohort effects in the present study. This pattern suggests
that the present results reflect true cohort differences in attention/working
memory and learning skills, independent of any cohort changes that might also be
occurring for IQ functions in general.
It is also worth noting that the negative cohort effects observed in the present
study were associated with relatively small effect size estimates (i.e. partial
η² < .010 on immediate-recall trials). However, the cohort effects are unlikely
to be attributable to chance
given the robust statistical power rendered by our large sample size. Moreover, from a
clinical perspective, even a small difference in raw scores can have a notable impact
on the conversion to standardized scores, which in turn can impact decisions about
an examinee’s level of cognitive functioning. For example, for an individual within the
age range of 45–54 years, a raw score of 4 on Trial 1 yields a z-score of –1.5 based on
CVLT-II norms versus a scaled score of 7 based on CVLT3 norms (note that the CVLT3
now uses scaled scores rather than z-scores); thus, this individual’s Trial 1 performance
could be interpreted as mildly impaired using CVLT-II norms and low average using
CVLT3 norms.
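The interpretive shift in this example reduces to simple standardization arithmetic. The sketch below is illustrative only: the normative means and standard deviations are hypothetical placeholders (not actual CVLT-II or CVLT3 norms), chosen solely to produce a shift of the kind described above; the scaled-score conversion uses the standard mean-10, SD-3 metric.

```python
# Illustrative sketch of how a shift in the normative mean changes the
# standardized score for the SAME raw score. Normative means/SDs below are
# hypothetical placeholders, not actual CVLT norms.

def z_score(raw, norm_mean, norm_sd):
    """Standardize a raw score against a normative mean and SD."""
    return (raw - norm_mean) / norm_sd

def scaled_score(z):
    """Convert a z-score to the scaled-score metric (mean 10, SD 3)."""
    return round(10 + 3 * z)

# Same raw score of 4 on an immediate-recall trial, standardized against
# two hypothetical normative samples:
z_old = z_score(4, norm_mean=7.0, norm_sd=2.0)  # higher-performing older norms
z_new = z_score(4, norm_mean=6.0, norm_sd=2.0)  # lower-performing newer norms

print(z_old, scaled_score(z_old))  # -1.5 6
print(z_new, scaled_score(z_new))  # -1.0 7
```

A one-point drop in the normative mean moves the same raw score from roughly 1.5 SDs below the mean (an impaired-range interpretation) to 1.0 SD below the mean (a low-average interpretation), mirroring the CVLT-II versus CVLT3 discrepancy described in the text.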
The present results have other important implications for clinical practice. In a
recent position paper, Bush et al. (2018) discussed the advantages and disadvantages
of using newer versus older versions of neuropsychological tests. The authors note
that an advantage of an older version of a neuropsychological test is that it may be
grounded more in empirical data supporting its validity, whereas a newer version may
lack such empirical support. Additionally, older versions of tests offer the advantage of
increased familiarity and ease of interpretation for clinicians. However, Bush et al.
(2018) also note that if cohort differences are found in the normative data between
the older and new versions of a test, then the use of the older version may provide
inaccurate standardized scores in a present-day evaluation (see also Alenius et al.,
2019). Given the present findings, the continued use of the CVLT-II’s 1999 norms in
today’s assessments may provide artificially lower standardized scores on indices of
attention/working memory and learning across the immediate-recall trials (e.g. Trial 1,
Trial 2, Trial 3, Trials 1–5 Total, List B). Further, given that the target lists and Yes/No
Recognition trial are the same on the CVLT3 as those used on the CVLT-II, (a) the validity
studies that have been conducted to date for the CVLT-II (over 1,000 published
studies; Delis et al., 2017) likely still have relevance for the CVLT3, and (b) familiarity
and ease of interpretation should be relatively equivalent across the two test versions.
Finally, the present results also suggest that the normative data that are currently
being used for other verbal learning and memory tests (e.g. California Verbal Learning
Test – Children’s Version; Rey Auditory Verbal Learning Test; Hopkins Verbal Learning
Test), which were initially collected before 2000 and have not undergone any major
revisions since the early 2000s, may also have become outdated and are in need of
re-norming in the near future.
In summary, the current study found evidence of a negative Flynn effect on the
attention/working memory and learning trials of the CVLT-II/CVLT3. The findings have
clinical implications for the use of word list learning and memory tests like the CVLT,
and raise intriguing questions about the possible adverse effects of recent socioenvironmental
changes on attention, working memory, and learning skills.