Tuesday, December 17, 2019

Voluntary castration: They felt penile removal made them more physically attractive; & at least two individuals thought that a penectomy would make them a better submissive sexual partner

Characteristics of Males Who Obtained a Voluntary Penectomy. Erik Wibowo, Samantha T. S. Wong, Richard J. Wassersug & Thomas W. Johnson. Archives of Sexual Behavior, Dec 16 2019. https://link.springer.com/article/10.1007/s10508-019-01607-8

Abstract: We report here on survey data from 11 genetic males, who had voluntary penectomies without any explicit medical need, yet did not desire testicular ablation. This group was compared to a control group of men who completed the same survey but had no genital ablation. The penectomy group was less likely to identify as male than the control group. They were also more likely to have attempted self-injury to their penis (at a median age of 41.5 years), been attracted to males without penises, and felt that they were more physically attractive without a penis than the controls. Motivations for voluntary penectomy were aesthetics (i.e., a feeling that the penile removal made them more physically attractive) or eroticism (i.e., at least two individuals thought that a penectomy would make them a better submissive sexual partner). In terms of sexual function, the penectomized and control groups reported comparable sexual function, with six penectomized individuals claiming to still be able to get and keep an erection, suggesting possible incomplete penile ablation. In their childhood, penectomized individuals were more likely than the controls to have pretended to be castrated and to have involved the absence of genitals of their toys in their childhood play. We discuss characteristics and sexual outcomes for individuals who have had a voluntary penectomy. A future study with a larger sample size on men who desire penectomies is warranted.



Discussion

In this study, we compared various self-reported data from
genetic males, who elected voluntary penectomy, with data
from men with intact genitals, who did not express any
desire for genital removal. We found some differences
between the two groups, in that penectomized men were
more likely (1) to identify as non-male for their gender, (2)
to have attempted self-genital injury, (3) to be attracted to
males without penises, (4) to feel attractive without a penis,
(5) to have pretended to be genital-less in their childhood
and (6) to have involved the absence of genitals of their toys
in their childhood play. Other psychosexual outcomes that
can be affected by hormonal levels such as sexual function,
depression and anxiety were comparable between the two
groups.

Psychiatric Condition
From this study, a majority (8 out of 11) of the penectomized individuals
felt attractive without a penis and this feeling may have
contributed to their desire for a penectomy. In addition, at least
two individuals thought that the absence of a penis allowed them
to be a better submissive sexual partner, i.e., their penectomy
desire was sexually motivated. However, we cannot explicitly
conclude if the penectomized individuals in our study have a
psychiatric condition without psychiatric evaluation. We asked if
they had been diagnosed with any medical conditions, and only two
answered, one with inflammatory bowel disease and the other
with anxiety/depression. Neither alluded to a major psychiatric
disorder. Collectively our data suggest that there are two motivations
for desiring a penectomy in our sample. Some individuals
feel that their penis is not part of their body and they may have
body integrity dysphoria. Others have a paraphilic motivation,
where they eroticize not having a penis as making them a better
submissive sexual partner.

Early Life Experience
One unexpected finding from this study was that most of
the penectomized individuals grew up in a medium-large
city, had not observed animal castration, and had not been threatened
with castration in their childhood. This is in contrast
to data on individuals with an orchiectomy, many of whom
were raised on a farm, had reported participating in animal
castration and were threatened with castration during their
childhood (Vale et al., 2013). However, the Vale et al. study
included predominantly men who had just an orchiectomy,
with only 7.5% of the men penectomized. We compared data
from the penectomized men in the Vale et al. study and the
current study, but no difference was found in terms of living
conditions during childhood. The fact of being raised
in a populous location, unlike on a farm, may explain why
the penectomized individuals had never witnessed animal
castration. These findings suggest that the etiology for voluntary
penectomy and voluntary castration are likely to be
different. It remains unknown to what extent social setting
and population density influenced interest in penectomy for
these individuals.
Among those who had played with male toy figurines
that lacked external genitalia, the penectomized individuals
were more likely to notice and incorporate the absence of
genitals in their play than non-penectomized individuals.
Although our sample size is small, it suggests that for
some individuals an extreme desire for a penectomy may
be linked to their childhood exposure to such anatomically
inaccurate male action figures. Half of the penectomized
individuals (as compared to 21% of the controls), who had
played with such toys, acknowledged eroticizing their play.
It is unclear whether interest in genital ablation for these
individuals preceded play with the toys or whether interest
in such toys came from an existing displeasure with their
external genitalia. Previously, studies on girls suggest that
exposure to Barbie dolls may influence their perception
of what is an ideal body (Dittmar, Halliwell, & Ive 2006;
Rice, Prichard, Tiggemann, & Slater, 2016). Thus, there is
a possibility that, in a similar fashion, exposure to genital-less
male toys may partially contribute to the penectomy
desire by some individuals in our study sample.
In addition, we found that the penectomized individuals
were more likely to have pretended to lack male external
genitalia than the controls. Interestingly, among those in
both groups, who had pretended to lack male genitalia,
they reported starting to pretend to be genital free at about
5 years of age and after playing with genital-less male toy
figurines. To what extent this earlier experience contributed
to the desire for a penectomy or its early emergence remains
unclear. However, years (for most, decades) later, the penectomized
individuals were more likely to have attempted self-injury
to their penis than the controls. While we did not ask
when the desire for a penectomy first arose, possibly they
delayed getting a penile ablation because they were aware of
the medical risks associated with the procedure.
There are further differences between our results and what
Vale et al. (2013) reported on childhood abuse in their study
of the Eunuch Archive community. We did not find any difference
between the penectomized men and the controls, but
Vale et al. found a higher likelihood of experiencing sexual
abuse among individuals who had, or were considering having,
genital ablations than among their control group. Again, that
study included mostly castrated individuals, and the questions
for assessing childhood trauma differed between the two
studies. However, the varied results raise again the possibility
of different etiology for genital ablation for the two populations,
with penectomized individuals less likely to experience
childhood abuse than the castrated individuals. Additional
data on this topic have been collected by our team, and further
analyses are warranted.

Sexual Parameters
In this study, the religiosity levels were comparable between
the penectomized and control groups. Previously, one study
showed that individuals who had voluntary genital ablations
were more likely to attend religious services and had been
raised in a religious household than men with intact genitalia
(Vale et al., 2013). However, again, that study included participants,
who had received genital removal, with less than
10% being penectomized. Thus, their data may be skewed
toward those who were solely castrated.
We found that the self-reported sexual function for the two
groups was similar. This is not surprising as penectomized individuals
were not orchiectomized and, thus, are not deprived of
natural gonadal hormones. What is intriguing is that six out
of 11 penectomized individuals reported that they still could
easily “get and keep” an erection despite having been penectomized.
Unfortunately, we do not know the extent of the penile
ablation in our sample, i.e., we do not know, for example, if the
individuals were fully or only partially penectomized. Those
with partial penectomy may still be able to have some penetrative
sex by using the remaining penile stump. Even with a
full penectomy, the roots of the penis remain embedded in the
perineum, and such individuals may still sense some erection
in the penile root.
Our data on erection contrast with what has been reported
for the majority of penectomized penile cancer patients, who
report impaired erectile function (Maddineni, Lau, & Sangar,
2009; Sosnowski et al., 2016). However, most penile cancer
is in men over the age of 55 (www.cance r.org/cance r/penil
e-cance r/cause s-risks -preve ntion /risk-facto rs.html), and older
than the average age of the penectomized individuals in our
study. This comparison suggests that the reasons for having
a penectomy may contribute to one’s perceived erection after
the procedure—penile cancer patients undergo the treatment to
survive, whereas the individuals in our study purposely sought
penectomy when it was not explicitly medically necessary.
What we cannot deduce is how much of these perceived
erections are phantom erections, where one feels an erection
in the absence of a penis (Wade & Finger, 2010). Previous
studies have indicated that some male-to-female transsexuals
(Ramachandran & McGeoch, 2008), as well as men penectomized for
other reasons such as penile cancer (Fisher, 1999; Crone-Münzebrock,
1951, as cited in Lawrence, 2010), experience phantom
erections after penectomy.

Limitations
Our study has several limitations. First, our sample size of individuals,
who have been penectomized but retained their testicles,
was small, i.e., just 11 individuals out of the 1023 who responded
to our survey. This affirms, however, how uncommon it is for men
to seek a penectomy in the absence of other genital modifications.
It is important to note that our survey was posted on Eunuch.org,
a website for people who are primarily interested in “castration”
and not necessarily “penectomy.” Respondents to our survey
were men with interest in all castration-related topics, but only
approximately 10% of the respondents claimed to have had genital
ablations. We know of no dedicated website for people who are
interested in penectomy alone.
Secondly, as the data were self-reported and anonymously
obtained, we cannot confirm the veracity of the data on genital
ablations. Participants were asked broadly about their genital
ablation status, but not interrogated about the exact extent of
penile tissue removal. Information on childhood experiences
is susceptible to recall bias. Lastly, regarding the controls, that
group was composed of men who visited the Eunuch Archive
website and thus have interest in genital removal, which may
be rarer in the general population.

We overestimate our performance when anticipating that we will try to persuade others & bias information search to get more positive feedback if given the opportunity; this confidence increase has a positive effect on our persuasiveness


Strategically delusional. Alice Soldà, Changxia Ke, Lionel Page & William von Hippel. Experimental Economics, Dec 16 2019. https://link.springer.com/article/10.1007/s10683-019-09636-9

Abstract: We aim to test the hypothesis that overconfidence arises as a strategy to influence others in social interactions. To address this question, we design an experiment in which participants are incentivized either to form accurate beliefs about their performance at a test, or to convince a group of other participants that they performed well. We also vary participants’ ability to gather information about their performance. Our results show that participants are more likely to (1) overestimate their performance when they anticipate that they will try to persuade others and (2) bias their information search in a manner conducive to receiving more positive feedback, when given the chance to do so. In addition, we also find suggestive evidence that this increase in confidence has a positive effect on participants’ persuasiveness.

Keywords: Overconfidence · Motivated cognition · Self-deception · Persuasion · Information sampling · Experiment

4 General discussion and conclusion

In the current research we tested the hypothesis that overconfidence emerges as a
strategy to gain an advantage in social interactions. In service of this goal, we conducted
two studies in which we manipulate participants’ anticipation of strategic
interactions and also the type of feedback they receive.
In our design, participants undertake both a Persuasion Task and an Accuracy
Task in all treatments. By switching the order of these tasks, we can manipulate
participants’ goals (being accurate vs. persuasive). Because they were not aware of
the nature of the second task when undertaking the first task, we prevent participants
from engaging in a cost-benefit analysis between the two goals. However, we
acknowledge that this choice of design has its own limitations.
First, self-deception might be possible in between the Accuracy and Persuasion
Task in the Accuracy-first treatments. Because we did not elicit beliefs again after
the Persuasion Task in the Accuracy-first treatments, we cannot rule out this possibility
directly. However, there is empirical evidence showing that the way people
interpret information tends to be sticky. For example, Chambers and Reisberg
(1985) presented participants with the famous duck/rabbit figure, which could be
interpreted as either of these animals. They found that once participants arrived at
an initial interpretation that it was a duck, they were unable to re-interpret it as a
rabbit without seeing it again. In the same manner, our hypothesis was established
on the expectation that once an (accurate) belief is formed, it is “on record”. It can
therefore not be consciously ignored by participants (even if they have incentives
to form overconfident beliefs in the next task). Hence, without additional data, participants
would not be able to re-construe their beliefs easily in our Accuracy-first
treatment after the Accuracy Task. In contrast, if a participant does not have a prior
accurate belief “on record”, it may be easier to interpret information in a self-serving
manner. Similarly, we conjecture that once an inflated belief has been formed in
the Persuasion-first treatment through motivated reasoning, it is also hard to “debias”
it, even though the subsequent Accuracy Task required them to form the most
accurate beliefs. There is no obvious reason to believe that participants were able to
easily inflate beliefs (after forming well-calibrated beliefs in Accuracy Task) later in
the Persuasion Task, but unable to easily deflate the overconfident beliefs (formed
in the Persuasion Task) in the subsequent Accuracy Task. Our experimental results
can be seen as justifying our assumptions ex-post, because we would not have found
any treatment difference in belief elicitations if participants were able to adjust their
beliefs flexibly depending on the incentives they were given in each task.
Second, the process of writing an essay in the Persuasion Task could lead participants
to form inflated self-assessment of their performance, even in the absence of
any self-deception motives. While there is evidence showing that self-introspection
may lead to overconfident self-assessment (Wilson and LaFleur 1995), Sedikides
et al. (2007) find that written self-reflection actually decreases self-enhancement
biases and increases accuracy. If the writing task made it harder for the participants
to form inflated beliefs, the treatment effect identified in the Self-Chosen Information
condition might be underestimated. Conversely, if the writing task helped
them form inflated beliefs, the effect size measured in the Self-Chosen Information
condition might be overestimated. However, if the Persuasion Task itself inflated
self-beliefs, we should have observed a significant treatment (Persuasion-first vs.
Accuracy-first) difference in overconfidence in the No Information condition. The
fact that it is not the case can be seen as tentative evidence that even if the Persuasion
Task itself could inflate self-beliefs, this effect is unlikely to be big enough
to undermine the main effect we have identified in the Self-Chosen Information
condition.
Our findings from both studies support the idea that self-beliefs respond to variations
in the incentives for overconfidence. In our experiments, participants were put
in situations where they could receive higher payoffs from persuading other players
that they performed well in a knowledge test. We observe that their confidence
in their performance increased in such situations. Consistent with the interpretation
that overconfidence is induced by strategic motivated reasoning, we observe that
when given the freedom to choose their feedback, participants who were motivated
to persuade chose to receive more positive information. This choice, in turn, helped
them form more confident beliefs about their performance. Participants holding
higher beliefs tend to be more successful at persuading the reviewers that they did
well through a written essay, particularly in the laboratory study.
These results support the hypothesis that people tend to be more overconfident
when they expect that confidence might lead to interpersonal gains, which helps to
explain why overconfidence is so prevalent despite the obvious costs of having miscalibrated
beliefs. Future research should investigate whether the type of interpersonal
advantage observed in the context of this experiment can also be observed in
different strategic contexts (e.g. negotiation, competition).

USA: Evidence of a negative Flynn effect on the attention/working memory and learning trials; as expected, education level, age group, and ethnicity were significant predictors of California Verbal Learning Test performance

Cohort differences on the CVLT-II and CVLT3: evidence of a negative Flynn effect on the attention/working memory and learning trials. Lisa V. Graves, Lisa Drozdick, Troy Courville, Thomas J. Farrer, Paul E. Gilbert & Dean C. Delis. The Clinical Neuropsychologist, Dec 12 2019. https://doi.org/10.1080/13854046.2019.1699605

Abstract
Objective: Although cohort effects on IQ measures have been investigated extensively, studies exploring cohort differences on verbal memory tests, and the extent to which they are influenced by socioenvironmental changes across decades (e.g. educational attainment; ethnic makeup), have been limited.

Method: We examined differences in performance between the normative samples of the CVLT-II from 1999 and the CVLT3 from 2016 to 2017 on the immediate- and delayed-recall trials, and we explored the degree to which verbal learning and memory skills might be influenced by the cohort year in which norms were collected versus demographic factors (e.g. education level).

Results: Multivariate analysis of variance tests and follow-up univariate tests yielded evidence for a negative cohort effect (also referred to as negative Flynn effect) on performance, controlling for demographic factors (p = .001). In particular, findings revealed evidence of a negative Flynn effect on the attention/working memory and learning trials (Trial 1, Trial 2, Trial 3, Trials 1–5 Total, List B; ps < .007), with no significant cohort differences found on the delayed-recall trials. As expected, education level, age group, and ethnicity were significant predictors of CVLT performance (ps < .01). Importantly, however, there were no interactions between cohort year of norms collection and education level, age group, or ethnicity on performance.

Conclusions: The clinical implications of the present findings for using word list learning and memory tests like the CVLT, and the potential role of socioenvironmental factors on the observed negative Flynn effect on the attention/working memory and learning trials, are discussed.

Keywords: Cohort differences, Flynn effect, verbal memory, California Verbal Learning Test


Discussion
In the present study, we examined differences in performance on the immediate- and
delayed-recall trials between the CVLT-II and CVLT3 normative samples. Specifically,
we explored the extent to which verbal learning and memory skills were influenced
by the cohort year in which norms were collected (i.e. 1999 for the CVLT-II versus
2016–2017 for the CVLT3) versus differences in education level. Of note, differences in
education level between the CVLT-II and CVLT3 normative samples mirrored an
increase in the proportion of U.S. adults who completed post-secondary education
during the time period spanning the development of the CVLT-II and CVLT3.
The present study revealed evidence of a negative Flynn effect on the attention/
working memory and learning trials of the CVLT-II/CVLT3, with the CVLT3 cohort performing
significantly worse than the CVLT-II cohort on Trial 1, Trial 2, Trial 3, Trials 1–5
Total, and List B. In contrast, no significant cohort differences were found on the
delayed-recall trials. Consistent with past research, education level, age group, and
ethnicity were shown to be significant predictors of overall CVLT performance.
Education level and age group were positively and negatively associated with CVLT-II/
CVLT3 performance, respectively. With regard to ethnicity, performance on multiple
immediate- and delayed-recall trials was significantly higher among White and
Hispanic individuals relative to African-American individuals. Nevertheless, none of these
demographic variables were shown to have an interactive effect with cohort year of
norms collection on performance.
The present study overcomes some of the limitations of previous studies that examined
Flynn or cohort effects on learning and memory of word lists (e.g. use of relatively
small sample sizes; limited age ranges; confounding time of testing with changes in the
target words; using data harmonization techniques to convert Logical Memory scores to
CERAD Word List scores). Of note, the present study offers the advantage of using the
same word lists administered to large normative samples that represent a wide age
range and that were matched to the demographic makeup of the U.S. census at the time
that the testing occurred in order to explore potential cohort effects on a standardized
measure of verbal learning and memory. Further, the present findings are in line with
recent research suggesting that a negative Flynn effect may be occurring not only on IQ
tests, but also on measures of auditory attention/working memory and learning of word
lists. That is, given that negative cohort effects were observed only on immediate-recall
trials (and appeared to be driven by cohort differences on the first three learning trials in
particular), the present findings provide further evidence that the attention/working
memory aspects of verbal memory may be particularly vulnerable to negative cohort
effects (Wongupparaj et al., 2017).
As discussed above, the present study indicated that the CVLT3 normative sample
was more highly educated, on average, than the CVLT-II normative sample, and this
difference mirrored the increase over the past two decades in the proportion of U.S.
adults who completed post-secondary education. However, while the present study
yielded evidence for a significant positive (albeit relatively small) association between
education level and CVLT-II/CVLT3 performance, the evidence for a negative cohort
effect on performance persisted even after accounting for differences in education
level and cohort year x education interactions (which were nonsignificant).
Furthermore, the observed negative cohort effect on immediate recall was present
across all age and ethnic groups, with cohort year x age group and cohort year x ethnicity
interactions also being nonsignificant.
The present results are consistent with the findings from a meta-analysis of cohort
effects on attention and working memory measures conducted by Wongupparaj et al.
(2017), who found a gradual decline in more complex auditory attention/working
memory skills (e.g. digit span backward) over the past four decades. Although the current
findings differ from those of Dodge et al. (2017), in which a positive cohort effect
was reported for word list learning and memory performance (including immediate
and delayed recall), there were a number of limitations in that study that make it difficult
to directly compare results from the two investigations (e.g. investigating only
older adults [65 years or older]; notable differences in the proportions of adults in
each age cohort within the pooled sample who were tested in the first versus second
data collection period; not addressing the fact that age cohort and period of testing
were potentially confounded; and using data harmonization techniques to convert
Logical Memory scores from a subset of its data pool into CERAD Word List scores to
facilitate analyses of cohort differences on verbal memory performance).
The results of the present study raise intriguing questions about the effects of socioenvironmental
changes that have unfolded during the time period spanning the
development of the second and third editions of the CVLT. In particular, the present
findings suggest that socioenvironmental changes may have occurred since 2000 that
(a) might be negatively impacting working memory and verbal learning skills, (b) are
not disproportionately affecting certain age or ethnic groups, and (c) are occurring
independent of generational changes in educational attainment. While education level
was examined in the present study, a number of researchers have highlighted distinctions
between educational attainment and education quality, and have suggested that
“educational attainment” as a homogeneous variable may have become diluted in
recent years due to varying standards and quality required for degrees across educational
settings (Allen & Seaman, 2013; Bratsberg & Rogeberg, 2018; Hamad et al., 2019;
Jaggars & Bailey, 2010; Nguyen et al., 2016; Rindermann et al., 2017). The lack of an
observed cohort year x education level interaction effect found in the present study
may reflect, in part, these recent concerns about the homogeneity of educational
attainment. This is important to consider for the present findings given that the negative
Flynn effect that has recently been found on IQ measures has been partly attributed
to reduced quality of education in those studies (Allen & Seaman, 2013; Jaggars
& Bailey, 2010).
It is difficult to escape the observation that the time period spanning the development
of the second and third editions of the CVLT (1999/2000 versus 2016/2017) also
coincided with a profound societal change: the digital revolution. As noted in recent
reviews (Rindermann et al., 2017; Wilmer, Sherman, & Chein, 2017), the use of digital
technology, while offering multiple advantages, may have subtle but significant
adverse effects on working memory and rote memorization skills. While relationships
between the use of digital technology and verbal learning and memory performance
were not formally investigated in the present study, the current findings invite the
intriguing hypothesis that increased use of digital tools may inadvertently have an
adverse effect on working memory and learning abilities. Unfortunately, there has
been a paucity of studies investigating associations of self-reported and performance-based
internet use with cognition. While there is evidence that the ability to perform
different tasks on the internet is significantly correlated with performance on cognitive
tests (Woods et al., 2019), no studies have directly investigated whether varying
degrees of internet, mobile phone, or other digital technology usage may positively or
negatively affect the development and maintenance of different domains of cognition.
Future research should explore potential differences between high and low internet
users on neuropsychological test performance. In addition, the present study was also
limited in that we were unable to assess relationships between other socioenvironmental
changes that may have occurred in the years spanning the development of
the CVLT-II and CVLT3 (e.g. generational changes in healthcare or standard of living;
see Dutton & Lynn, 2013; Rindermann et al., 2017).
The present findings were likely related to true cohort differences in verbal learning
and memory skills, and not to differences between the makeup of the CVLT-II versus
the CVLT3, given that (a) the lists of target words are identical across the two versions
of the test; (b) the negative cohort effect was only observed on select trials, thereby
indicating that one version is not uniformly harder or easier than the other; and (c)
other recent studies have also found evidence for a negative Flynn effect on attention/
working memory components of verbal memory (Wongupparaj et al., 2017). One
question that arises is whether the observed negative cohort effect found on the CVLT
in the present study was due to a negative Flynn effect specifically on attention/working
memory and learning skills versus a broader effect on IQ in general, which has
also been reported in recent years (Bratsberg & Rogeberg, 2018; Dutton & Lynn, 2013,
2015; Dutton et al., 2016; Flynn & Shayer, 2018; Pietschnig & Gittler, 2015; Shayer &
Ginsburg, 2007, 2009; Sundet et al., 2004; Teasdale & Owen, 2005, 2008; Woodley &
Meisenberg, 2013). Given that the CVLT-II and CVLT3 were not co-normed with IQ
tests, we cannot directly investigate this relationship. However, IQ has been shown to
correlate robustly with education level, and education was not shown to drive or moderate
any of the observed cohort effects in the present study. This pattern suggests
that the present findings were related to true cohort differences in attention/working
memory and learning skills independent of any cohort changes that might also be
occurring for IQ functions in general.
It is also worth noting that the negative cohort effects observed in the present
study were associated with relatively small effect size estimates (i.e. ηp² < .010 on
immediate-recall trials). However, the cohort effects are unlikely to be due to random chance
given the robust statistical power rendered by our large sample size. Moreover, from a
clinical perspective, even a small difference in raw scores can have a notable impact
on the conversion to standardized scores, which in turn can impact decisions about
an examinee’s level of cognitive functioning. For example, for an individual within the
age range of 45–54 years, a raw score of 4 on Trial 1 yields a z-score of –1.5 based on
CVLT-II norms versus a scaled score of 7 based on CVLT3 norms (note that the CVLT3
now uses scaled scores rather than z-scores); thus, this individual’s Trial 1 performance
could be interpreted as mildly impaired using CVLT-II norms and low average using
CVLT3 norms.
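
The comparison above mixes two score metrics, so it can help to put both on the same scale. The following is a minimal sketch, not the test publisher's scoring procedure: it assumes that scaled scores follow the common psychometric convention of a mean of 10 and a standard deviation of 3 (the source does not state this), and the function names are hypothetical. The only values taken from the text are the z-score of -1.5 and the scaled score of 7.

```python
from statistics import NormalDist

# Assumption (not stated in the source): scaled scores use the common
# convention of mean 10 and standard deviation 3.
SCALED_MEAN = 10.0
SCALED_SD = 3.0


def scaled_to_z(scaled: float) -> float:
    """Convert a scaled score to a z-score under the assumed convention."""
    return (scaled - SCALED_MEAN) / SCALED_SD


def z_to_scaled(z: float) -> float:
    """Convert a z-score to a scaled score under the assumed convention."""
    return SCALED_MEAN + SCALED_SD * z


def percentile(z: float) -> float:
    """Percentile rank of a z-score under a standard normal distribution."""
    return 100.0 * NormalDist().cdf(z)


# Values quoted in the text: same raw score of 4 on Trial 1, ages 45-54.
z_cvlt2 = -1.5             # z-score under CVLT-II norms
z_cvlt3 = scaled_to_z(7)   # scaled score of 7 under CVLT3 norms

print(f"CVLT-II norms: z = {z_cvlt2:+.2f}, percentile = {percentile(z_cvlt2):.1f}")
print(f"CVLT3 norms:   z = {z_cvlt3:+.2f}, percentile = {percentile(z_cvlt3):.1f}")
print(f"Shift between norm sets: {z_cvlt3 - z_cvlt2:.2f} SD")
```

Under these assumptions the same raw score moves by half a standard deviation between the two norm sets, which is enough to cross a descriptive boundary such as the one between mildly impaired and low average.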
The present results have other important implications for clinical practice. In a
recent position paper, Bush et al. (2018) discussed the advantages and disadvantages
of using newer versus older versions of neuropsychological tests. The authors note
that an advantage of an older version of a neuropsychological test is that it may be
grounded more in empirical data supporting its validity, whereas a newer version may
lack such empirical support. Additionally, older versions of tests offer the advantage of
increased familiarity and ease of interpretation for clinicians. However, Bush et al.
(2018) also note that if cohort differences are found in the normative data between
the older and new versions of a test, then the use of the older version may provide
inaccurate standardized scores in a present-day evaluation (see also Alenius et al.,
2019). Given the present findings, the continued use of the CVLT-II’s 1999 norms in
today’s assessments may provide artificially lower standardized scores on indices of
attention/working memory and learning across the immediate-recall trials (e.g. Trial 1,
Trial 2, Trial 3, Trials 1–5 Total, List B). Further, given that the target lists and Yes/No
Recognition trial are the same on the CVLT3 as those used on the CVLT-II, 1) the validity
studies that have been conducted to date for the CVLT-II (over 1,000 published
studies; Delis et al., 2017) likely still have relevance for the CVLT3, and 2) familiarity
and ease of interpretation should be relatively equivalent across the two test versions.
Finally, the present results also suggest that the normative data that are currently
being used for other verbal learning and memory tests (e.g. California Verbal Learning
Test – Children’s Version; Rey Auditory Verbal Learning Test; Hopkins Verbal Learning
Test), which were initially collected before 2000 and have not undergone any major
revisions since the early 2000s, may also have become outdated and are in need of
re-norming in the near future.
In summary, the current study found evidence of a negative Flynn effect on the
attention/working memory and learning trials of the CVLT-II/CVLT3. The findings have
clinical implications for the use of word list learning and memory tests like the CVLT,
and raise intriguing questions about the possible adverse effects of recent socioenvironmental
changes on attention, working memory, and learning skills.

Monday, December 16, 2019

Third parties’ anger at transgressors, and their intervention and punishment on behalf of victims, varied in real-life conflicts as a function of how much third parties valued the welfare of the disputants

When and Why Do Third Parties Punish Outside of the Lab? A Cross-Cultural Recall Study. Eric J. Pedersen et al. Social Psychological and Personality Science, December 16, 2019. https://doi.org/10.1177/1948550619884565

Abstract: Punishment can reform uncooperative behavior and hence could have contributed to humans’ ability to live in large-scale societies. Punishment by unaffected third parties has received extensive scientific scrutiny because third parties punish transgressors in laboratory experiments on behalf of strangers that they will never interact with again. Often overlooked in this research are interactions involving people who are not strangers, which constitute many interactions beyond the laboratory. Across three samples in two countries (United States and Japan; N = 1,294), we found that third parties’ anger at transgressors, and their intervention and punishment on behalf of victims, varied in real-life conflicts as a function of how much third parties valued the welfare of the disputants. Punishment was rare (1–2%) when third parties did not value the welfare of the victim, suggesting that previous economic game results have overestimated third parties’ willingness to punish transgressors on behalf of strangers.

Keywords: third-party punishment, anger, cooperation, bystander intervention

WTR = welfare trade-off ratio

Discussion
Here, we proposed that a major function of third-party punishment
is to deter aggressors from harming individuals with
whom the punisher shares a fitness interest and that the psychological
mechanisms that regulate punishment take into account
the punisher’s perceived welfare interdependence with the disputants
in a conflict (Pedersen et al., 2018). To test these
hypotheses, we asked U.S. students, U.S. Mechanical Turk
workers, and Japanese students to recall how they responded
the last time they observed a conflict. The recall study method
ensures a wide sampling of situations and thus high generalizability
to real-life conflicts. We found that third parties’ WTRs
for the victim in a conflict indeed predicted anger, intervention,
and punishment on behalf of the victim. We also found that
third parties’ WTRs for the transgressor were negatively associated
with anger toward the transgressor but not with intervention
or punishment as we had predicted. Besides the possibility
that WTR for the transgressor truly does not predict intervention
and punishment, one possibility for the lack of these associations
is that third parties who intervene or punish may
temporarily hold a negative WTR for the transgressor—that
is, they are willing to incur costs to inflict costs. Because our
WTR Scale only went down to 0, any negative WTRs would
have manifested as zeros and thus the variability of the scale
could have been restricted (see Figure S2, which suggests this
may have been the case), which would limit our power to detect
an effect.
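
The floor argument can be illustrated with a small sketch. The latent values below are hypothetical, not data from the study; they only show how recording every negative WTR as 0 shrinks the observed variance of the scale.

```python
from statistics import pvariance

# Hypothetical latent WTRs; negative values mean a willingness to pay to harm.
latent_wtr = [-0.6, -0.3, -0.1, 0.0, 0.2, 0.5, 0.8]


def censor_at_floor(values, floor=0.0):
    """Record any value below the scale floor as the floor itself."""
    return [max(floor, v) for v in values]


observed_wtr = censor_at_floor(latent_wtr)

latent_var = pvariance(latent_wtr)      # variance if negatives could be measured
observed_var = pvariance(observed_wtr)  # variance on a scale that stops at 0
```

With less variance in the predictor, an association of a given size is harder to detect, which is the power concern raised above.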
These findings were generally consistent across our three
samples and never differed in kind, only magnitude. For intervention
and punishment, the effect of third parties’ WTR for
the victim was constant across all samples, though Japanese
students intervened and punished less often than either U.S.
sample. For anger, there were minor differences among the
samples in the magnitude of the effects of third parties’ WTRs
for the victim and the transgressor, but they remained in the
same, predicted directions in all samples. Thus, we have initial
evidence that our findings are at least somewhat generalizable
beyond a U.S. student population, both to a more general U.S.
population and to Japanese students.
The low model-predicted probabilities of punishment
(≤ .02) we found when WTR for the victim was 0 suggest that
the frequency of third-party punishment has likely been overstated
in the literature that has focused on results from
laboratory-based experimental economics games (for similarly
low rates of punishment in naturalistic settings, see Balafoutas
et al., 2014, 2016). Thus, in addition to providing support for
our hypotheses that third-party anger, intervention, and punishment
vary as a function of the prospective punisher’s WTRs
toward disputants in a conflict, the present study adds to a
growing body of evidence suggesting that direct third-party
punishment on behalf of strangers is not a common feature of
human cooperation (Guala, 2012; Krasnow et al., 2012,
2016; Kriss, Weber, & Xiao, 2016; Pedersen et al., 2013,
2018; Phillips & Cooney, 2005).
We recognize that some might view our design choices here
as restrictive because we limited our scope to conflicts where
there was a direct harm to a victim and only considered intervention
and punishment that occurred in the moment. These
were intentional choices to mimic the types of interactions that
are created in the third-party punishment game (Fehr & Fischbacher,
2004), which typically shows that a majority of people
anonymously engage in immediate, uncoordinated, costly punishment
on behalf of victims. These findings have been generalized
to draw conclusions about humans’ willingness to
directly punish transgressions and what this implies for the evolution
of cooperation in humans (Fehr & Fischbacher, 2003;
Henrich et al., 2010, 2006; for review, see Pedersen et al.,
2018). Our results suggest that people are much less likely to
engage in this type of punishment than a direct generalization
of previous laboratory experiments would imply, though perhaps
future studies will show higher rates of after-the-fact punishment
with low-WTR parties than we found here. Thus, it is
important to note that our data cannot speak directly to a
broader range of social norm violations, some of which could
be likely to evoke punishment. Additionally, we did not focus
on indirect types of retaliation, such as gossip, or other mechanisms
that are likely important to maintaining cooperation and
social norms, such as partner choice.
Additionally, the higher rate of intervention than punishment
we observed here comports well with evidence suggesting
that people prefer alternatives to punishment (e.g.,
helping the victim) when they are available (Balafoutas et al.,
2014, 2016; Chavez & Bicchieri, 2013). It also suggests that
shifting focus beyond punishment could be a fruitful approach
to more fully understanding how third parties respond to conflicts
in the real world. We do note that the amount of
reported intervention could have been inflated due to our asking
subjects to report whether they had “helped” either person
involved in the conflict, though this was asked after subjects
had already chosen a particular conflict to recall and thus
probably did not bias the choice of event in the first place.
It is also possible that our prompt elicited different recollections
between the U.S. and Japanese samples, which could
explain the difference in intervention and punishment rates
between the countries.
This study had some limitations. First, memory limitations
may have prevented people from accurately recalling the
details of past events. For example, subjects’ WTRs for the victims
and transgressors were retrospective; consequently, they
might have been disproportionately reflective of their current
WTRs for the victims and transgressors. Indeed, it is possible
that choosing to intervene or punish increased subjects’ commitment
toward victims and thus could have increased their
WTRs. Although we cannot rule this possibility out given the
nature of our data, we do note that recalled WTRs varied
expectedly as a function of the relationship between subjects
and the victims (see Supplemental Material), which suggests that
reported WTRs did at least moderately correspond to the existing
relationships.
Second, subjects’ reports might have been distorted by
socially desirable responding. The low levels of punishment
speak against this concern, but it might have played a role in
intervention responses. The possibility of socially desirable
responding in combination with our exclusion of cases a priori
from situations in which the costs of intervening were very
steep (e.g., conflicts involving guns, multiple transgressors)
leads us to believe that the current study did not underestimate
intervention and punishment frequency. Finally, we did not
code for consolation—attempting to make the victim feel better
after the conflict had ended—and instead treated it as the
same as doing nothing because it had no material effect on the
conflict as it was occurring. Although consolation is certainly
a much less costly helping behavior, it nevertheless may help
the victim and is an important area for future research (De
Waal, 2008).
To conclude, the present investigation moved beyond the
question, “do people punish on behalf of strangers,” to ask,
“when and why do people intervene on behalf of others?” Our
method sampled intervention and punishment decisions
across a wide range of situations and multiple populations,
complementing studies that have examined punishment (and
the desire to punish) in specific real-life situations (Balafoutas
et al., 2014, 2016; Hofmann et al., 2018). Our results converged
with results from these other studies, suggesting that
intervention is much more common than punishment in everyday
life. Perceived welfare interdependence with the victim
emerged as the strongest predictor of intervention and punishment,
signaling its promise as an explanation of involvement
in others’ affairs.

Personality traits of the most intelligent: They have higher internal consistency estimates, greater scale variances, and slightly larger scale ranges

A test of the differentiation of personality by intelligence hypothesis using the Big Five personality factors. Julie Aitken Schermer, Denis Bratko, Jelena Matić Bojić. Personality and Individual Differences, Volume 156, April 1 2020, 109764. https://doi.org/10.1016/j.paid.2019.109764

Abstract: The hypothesis that personality is more differentiated, or variable, for individuals higher in intelligence was tested in a large sample (N = 1,050) of young Croatian adults. Participants completed a measure of the Big Five personality factors in self-report format. Also administered was a verbal ability test as an estimate of intelligence. As the verbal ability scores had a normal distribution, tertile splits were created and the lower and higher groups' means, standard deviations, scale ranges, and coefficient alpha for each scale were compared. The higher ability tertile had higher internal consistency estimates, greater scale variances, and slightly larger scale ranges. The results therefore provide some support for the differentiation of personality by intelligence hypothesis and do suggest that personality scale responses may differ depending on the intelligence level of the sample.
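
The per-group statistics named in the abstract can be sketched as follows. This is an illustration using the standard formula for coefficient (Cronbach's) alpha, not the authors' code; the function names and the toy response matrix are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a scale.

    responses: list of rows, one per respondent, each a list of item scores.
    """
    k = len(responses[0])                      # number of items in the scale
    items = list(zip(*responses))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


def scale_summary(responses):
    """Variance, range, and alpha of the scale totals for one ability group."""
    totals = [sum(row) for row in responses]
    return {
        "variance": variance(totals),
        "range": max(totals) - min(totals),
        "alpha": cronbach_alpha(responses),
    }


# Hypothetical toy data: four respondents answering three items identically,
# so the items are perfectly consistent with one another.
toy_group = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
toy_summary = scale_summary(toy_group)
```

Running `scale_summary` separately on the lower and higher ability tertiles would give the quantities the abstract compares.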

The traditional perspective on the ideology-prejudice relationship suggests that conservatism and associated traits (e.g. low cognitive ability, low openness) are associated with prejudice

Ideological (A)symmetries in prejudice and intergroup bias, Jarret T Crawford, Mark J Brandt. Current Opinion in Behavioral Sciences, Volume 34, August 2020, Pages 40-45. https://doi.org/10.1016/j.cobeha.2019.11.007

Highlights
• The traditional perspective on the ideology-prejudice relationship suggests that conservatism and associated traits (e.g. low cognitive ability, low openness) are associated with prejudice.
• The worldview conflict perspective challenges the traditional perspective by testing prejudice against a more heterogeneous array of target groups.
• Research from the worldview conflict perspective shows that both liberals and conservatives (as well as those low and high in several associated traits) are prejudiced against dissimilar groups.
• There are obvious ways in which these perspectives differ, but also some common ground (e.g. presumption of some underlying psychological differences between liberals and conservatives).
• This still leaves open questions related to the robustness of underlying assumptions, differences between elites and the public, precise causal processes, and worldview conflict reduction.


Abstract: The traditional perspective on the political ideology and prejudice relationship holds that political conservatism is associated with prejudice, and that the types of dispositional characteristics associated with conservatism (e.g. low cognitive ability, low Openness) explain this relationship. This conclusion is limited by the limited number and types of groups studied. When researchers use a more heterogeneous array of targets, people across the political spectrum express prejudice against groups with dissimilar values and beliefs. Evidence for this worldview conflict perspective emerges in both politics and religion, as well as individual differences such as Openness, disgust sensitivity and cognitive ability. Although these two perspectives differ substantially, there is some identifiable common ground between them, particularly the assumption of some psychological differences between liberals and conservatives. We discuss some remaining open questions related to worldview conflict reduction, causal processes, the robustness of the assumptions of the traditional perspective, and differences between political elites and the public.



‘Good genes’ is the default explanation for the evolution of elaborate ornaments, despite abundant evidence that the most attractive mates are seldom those that produce the most viable offspring

It’s Not about Him: Mismeasuring ‘Good Genes’ in Sexual Selection. Angela M. Achorn, Gil G. Rosenthal. Trends in Ecology & Evolution, December 16 2019, https://doi.org/10.1016/j.tree.2019.11.007

Highlights
• ‘Good genes’ remains the default explanation for the evolution of elaborate ornaments, despite abundant evidence that the most attractive mates are seldom those that produce the most viable offspring.
• ‘Good genes’, in which preferred traits predict offspring viability, is often conflated with other indirect benefits, including genetic compatibility, heterozygosity, and offspring attractiveness.
• Few studies in fact test the key predictions of ‘good genes’ models and, as predicted by theory, they show scant evidence for additive effects of mating decisions on offspring viability.
• Direct tests of indirect genetic benefits should measure the attractiveness and viability of offspring from a large number of matings, distinguish between additive and nonadditive benefits, and control for differential investment in offspring.

Abstract: What explains preferences for elaborate ornamentation in animals? The default answer remains that the prettiest males have the best genes. If mating signals predict good genes, mating preferences evolve because attractive mates yield additive genetic benefits through offspring viability, thereby maximizing chooser fitness. Across disciplines, studies claim ‘good genes’ without measuring mating preferences, measuring offspring viability, distinguishing between additive and nonadditive benefits, or controlling for manipulation of chooser investment. Crucially, studies continue to assert benefits to choosers purely based on signal costs to signalers. A focus on fitness outcomes for choosers suggests that ‘good genes’ are insufficient to explain the evolution of mate choice or of sexual ornamentation.

Keywords: genetic quality, mate choice, indirect benefits, genetic benefits


Reevaluating the Evidence: ‘Good Genes’ in Context

The question of whether sexual selection is good or bad for choosers and populations is a central one in evolutionary biology, animal communication, and conservation biology. By focusing on signals and signal costs, studies often fail to test the basic premise that ornaments are the target of mate choice [46], let alone that they confer any benefits on choosers.

When studies do test for ‘good genes’, evidence suggests that this process accounts for a modest fraction of variance in sexual fitness [2,6,13]. A recent meta-analysis [6] showed that attractiveness was highly heritable, consistent with FLK models, but good genes received mixed support. Attractiveness did not correlate with traits directly associated with fitness (life-history traits). However, attractiveness did positively correlate with physiological traits, such as immunocompetence and condition.

Similarly, recent studies provide at best mixed support for the intuition that the prettiest males have the best genes, although perhaps the most tenacious males do. The clearest evidence for good genes (Table 1) has found them for traits where choosers have limited agency to make mating decisions. Persistent courtship or mating is likely to increase courter success no matter how choosers behave before mating [47], and frequently imposes direct costs on females [48]. The one study that controlled for differential allocation [38] yielded equivocal results. There is widespread evidence that choosers invest more in the offspring of attractive males, but more-ornamented courters may manipulate choosers into investing in a mating beyond the lifetime fitness optima of the choosers.

For conspicuous display traits, weak signals of good genes should be the rule. A seeming paradox of good genes models is that preferences for good genes are most likely to be maintained if genetic effects on viability are weak, since this slows the depletion of genetic variation by selection [49]. ‘Good genes’ are likely to be less important to preference evolution than self-reinforcing coevolution channeled by mating biases [13] and direct selection on mating decisions [13,14] (see Outstanding Questions).

More rigorous measures of ‘good genes’ speak to another central question, namely whether sexual selection is more likely to confer a positive or negative effect on population mean fitness [50]. On the one hand, populations can benefit from accelerated purifying selection through sexual selection on the courting sex, meaning that sexual selection can increase population fitness if there is a positive correlation between preference and fitness. On the other hand, sexual selection can decrease population fitness through reduced viability as a consequence of sexual conflict.

The answer likely depends on the nature of selection experienced by populations. When competing males are parasitized, sexually successful male fruit flies (Drosophila melanogaster) sire more parasite-resistant offspring, while the opposite holds true for winners of contests between unparasitized males [24] (Table 1). Along these lines, a recent meta-analysis [50] used 459 effect sizes from 65 experimental studies in which researchers manipulated the presence or strength of sexual selection, encompassing both intrasexual selection and mate choice, and then measured some aspect of fitness. The results indicate that sexual selection tends to increase population fitness, particularly when populations are exposed to novel environmental conditions.

However, in contrast to Prokop and colleagues’ meta-analysis [6], fitness traits related to immunocompetence were an exception: sexual selection covaried with weaker immunity.

Concluding Remarks

Few studies evaluate the critical predictions of benefit models of mate choice. Those that do suggest that good genes have an important role in adapting to novel environments, but that they may be more important in terms of [...]. The glimpses we have suggest that good genes provide an important, if circumscribed, contribution to chooser fitness.

We can begin to tease apart the selective forces shaping mating preferences, but tests of good genes hypotheses must assess meaningful measures of offspring viability. Assessing offspring viability is conceptually straightforward, if not always easy in wild populations. This can be done by using molecular markers to reconstruct pedigrees and correlating preferred trait expression with survivorship to maturity [36], or by examining proxy measures of viability, such as juvenile growth rate or size [39]. This approach does not rule out the possibility of differential allocation. If choosers invest more in the offspring of attractive partners even at the expense of their own lifetime reproductive success, then mates may not be providing a net increase in average offspring viability [51,52]. Artificial insemination or, in externally fertilizing species, in vitro assays [8] can control for differential allocation.

A counterintuitive point about ‘good genes’ and ‘genetic quality’ is that, by definition, they are difficult, if not impossible, to infer from courter genotypic data alone. Notably, a gene favored by selection (e.g., a gene that buffers oxidative stress and helps produce attractive offspring) can carry a higher genetic load with respect to other components of viability [53,54]. An allele that is ‘good’ with respect to courter function may be in linkage disequilibrium with alleles that reduce (or increase) offspring viability. Again, direct measures of fitness are required to measure ‘good genes’. A major challenge to testing for good genes is that these effects are predicted to be weak [6,55] and, therefore, require large sample sizes. For large, long-lived animals with small populations (e.g., nonhuman primates), longitudinal samples across generations provide a feasible, if slow, approach to detecting viability consequences of mate choice. Simply measuring correlations between ornament elaboration and other courter phenotypes does not distinguish among models of signal evolution through mate choice.

‘Good genes’ is appealing because it assigns utilitarian explanations to seemingly extravagant traits and the desires that shape them. There is allure to the idea that mating preferences can increase population mean fitness and local adaptation. A loose construction of ‘good genes’ and ‘genetic quality’ remains the default explanation for mating preferences and sexual dimorphisms outside the immediate field of sexual selection and in the popular literature. We suggest that the persistence of this default view comes from a sloppy conception of these terms that leads to insufficient empirical tests of adaptive hypotheses.

Unfortunately, ‘good genes’ especially lends itself to what Bateson [56], writing about the term ‘mate selection,’ termed ‘unconscious punning’. The term conjures up so much more than ‘breeding value for viability.’ There is a precise technical term, coined by Galton in 1883 [57], which means ‘good genes’ or ‘true genes’ in Greek. Eugenics is tainted forever by the policies it incited, but we remain entranced by the intuition that Beauty marches in lockstep with Truth. Thus, evidence for ‘good genes’ and ‘genetic quality’ in the vernacular sense, is conflated with support for precise evolutionary models. We would hesitate to study the foraging ecology of koalas exclusively by grinding up eucalyptus leaves, but this is all too often the logic we invoke to study mate choice [46]. A perspective centered on choosers, rather than on the signatures that their choices leave on courters, is essential for understanding mate choice and its consequences.

The harsher grading policies in STEM courses disproportionately affect women; restrictions on grading policies that equalize average grades across classes help to close the STEM gender gap and increase overall enrollment

Equilibrium Grade Inflation with Implications for Female Interest in STEM Majors. Thomas Ahn, Peter Arcidiacono, Amy Hopson, James R. Thomas. NBER Working Paper No. 26556. December 2019. https://www.nber.org/papers/w26556

Abstract: Substantial earnings differences exist across majors, with the majors that pay well also having lower grades and higher workloads. We show that the harsher grading policies in STEM courses disproportionately affect women. To show this, we estimate a model of student demand for courses and optimal effort choices of students conditional on the chosen courses. Instructor grading policies are treated as equilibrium objects that in part depend on student demand for courses. Restrictions on grading policies that equalize average grades across classes help to close the STEM gender gap and increase overall enrollment in STEM classes.


5.3 Grade estimates
The estimated αs, the department-specific ability weights, are given in Table 6. These are calculated by taking the reduced-form θs, undoing the normalization on the γs, and subtracting off the part of the reduced-form θs that reflects study time (taken from ψ). The departments are sorted such that those with the lowest female estimate are listed first. Note that in all departments the female estimate is negative. This occurs because females study substantially more than males yet receive only slightly higher grades. Given that sorting into universities takes place on both cognitive and non-cognitive skills and that women have a comparative advantage in non-cognitive skills, males at UK have higher cognitive skills than their female counterparts even though in the population cognitive skills are similar between men and women. Negative estimates are also found for Hispanics. While Hispanics have higher grades than African Americans, our estimates of the study costs suggested that they also studied substantially more. Given the very high estimate of Hispanic study time we would have expected Hispanics to perform even better in the classroom than they actually did if their baseline abilities were similar to African Americans. With the estimates of the grading equation, we can report expected grades for an average student. We do this for freshmen, separately by gender, both unconditionally and conditional on taking courses in that department in the semester we study. Results are presented in Table 7. Three patterns stand out. First, there is positive selection into STEM courses: generally those who take STEM classes are expected to perform better than the average student. This is not the case for many departments. Indeed, the second pattern is that negative selection is more likely to occur in departments with higher grades. Finally, women are disproportionately represented in departments that give higher grades for the average student.
Of the seven departments that give the highest grades for the average student (female or male), all have a larger fraction female than the overall population. In contrast, of the five departments that give the lowest grades (STEM and Economics), females are under-represented relative to the overall population in all but one (Biology).



Wealth Taxation in the United States: The effect of the Swiss tax and Warren tax on wealth inequality is minuscule, lowering the Gini coefficient by at most 0.0005 Gini points

Wealth Taxation in the United States. Edward N. Wolff. NBER Working Paper No. 26544. December 2019. https://www.nber.org/papers/w26544

Abstract: The paper analyzes the fiscal effects of a Swiss-type tax on household wealth, with a $120,000 exemption and marginal tax rates running from 0.05 to 0.3 percent on $2,400,000 or more of wealth. It also considers a wealth tax proposed by Senator Elizabeth Warren with a $50,000,000 exemption, a two percent tax on wealth above that and a one percent surcharge on wealth above $1,000,000,000. Based on the 2016 Survey of Consumer Finances, the Swiss tax would yield $189.3 billion and the Warren tax $303.4 billion. Only 0.07 percent of households would pay the Warren tax, compared to 44.3 percent for the Swiss tax. The Swiss tax would have a very small effect on income inequality, lowering the post-tax Gini coefficient by 0.004 Gini points. The effect of the Swiss tax and Warren tax on wealth inequality is minuscule, lowering the Gini coefficient by at most 0.0005 Gini points.
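
As a rough illustration of the Warren schedule as the abstract describes it, the sketch below computes the annual liability for a single household. It reads the abstract literally (2 percent on wealth above the exemption, plus 1 percent on wealth above $1 billion); the function name is hypothetical, and this is not the paper's simulation model. The Swiss schedule is not sketched because the abstract gives only its endpoints, not its intermediate brackets.

```python
# Parameters as stated in the abstract.
WARREN_EXEMPTION = 50_000_000
WARREN_BASE_RATE = 0.02
WARREN_SURCHARGE_THRESHOLD = 1_000_000_000
WARREN_SURCHARGE_RATE = 0.01


def warren_tax(wealth):
    """Annual liability under a literal reading of the abstract's description."""
    # 2% applies to every dollar above the $50 million exemption.
    base = WARREN_BASE_RATE * max(0, wealth - WARREN_EXEMPTION)
    # An extra 1% applies to every dollar above $1 billion.
    surcharge = WARREN_SURCHARGE_RATE * max(0, wealth - WARREN_SURCHARGE_THRESHOLD)
    return base + surcharge
```

For example, a household with $150 million in wealth would owe 2 percent of the $100 million above the exemption, or about $2 million, while a household below $50 million would owe nothing.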




The (In)accuracy of Forecast Revisions in a Football Score Prediction Game: Better go with your gut instinct

Going with your Gut: The (In)accuracy of Forecast Revisions in a Football Score Prediction Game. Carl Singleton, James Reade, Alasdair Brown. Journal of Behavioral and Experimental Economics, December 16 2019, 101502. https://doi.org/10.1016/j.socec.2019.101502

Highlights
•    Judgement revisions led to worse performance in a football score prediction game
•    This is robust to the average forecasting ability of individuals playing the game
•    Revisions to the forecast number of goals scored in matches are generally excessive

Abstract: This paper studies 150 individuals who each chose to forecast the outcome of 380 fixed events, namely all football matches during the 2017/18 season of the English Premier League. The focus is on whether revisions to these forecasts before the matches began improved the likelihood of predicting correct scorelines and results. Against what theory might expect, we show how these revisions tended towards significantly worse forecasting performance, suggesting that individuals should have stuck with their initial judgements, or their ‘gut instincts’. This result is robust to both differences in the average forecasting ability of individuals and the predictability of matches. We find evidence this is because revisions to the forecast number of goals scored in football matches are generally excessive, especially when these forecasts were increased rather than decreased.



6  Summary and further discussion

In this paper, we have analysed the forecasting performance of individuals who each applied
their judgement to predict the outcomes of many fixed events. The context of this analysis was
the scoreline outcomes of professional football matches. We found that when individuals made
revisions their likelihood of predicting a correct scoreline, which they achieved around 9% of
the time when never making a revision, significantly decreased. The same applied for forecast
revisions to the result outcomes of matches. Not only were these findings robust to unobserved
individual forecasting ability and the predictability of events, but also there is evidence that
performance would have improved had initial judgements been followed.

As already mentioned, these results have some similarities with those found previously in
the behavioural forecasting literature. One explanation could be that game players anchor their
beliefs, expectations and, consequently, their forecasts on past or initial values. However, this
behaviour would not be consistent with our finding that on average forecasters made revisions
which not only improved on their goals scored forecast errors but which were also excessive.

There are several areas for further research, which could be explored with extensions of
the dataset used here. First, it appears to be a relatively open question as to how sources of
bias among sports forecasters interact with how they make revisions, such as the well-known
favourite-longshot bias. Second, players of the forecasting game studied here do reveal which EPL
team they have the greatest affinity for, though we are yet to observe this information ourselves. It
is an interesting question as to whether any wishful-thinking by the players manifests itself more
greatly before or after they revise their forecasts. Third, an aspect which could be studied from
these current data is whether players improve their forecasts over time, and if they learn how to
play more to the rules of the game itself, which should lead them to favour more conservative goals
forecasts. Fourth, these results concern a selective random sample of players who “completed”
the game. These are likely to be individuals who extract significant utility from making forecasts
of football match scorelines, who are thus more likely to return to their initial forecasts and make
revisions. It would be interesting to see whether more casual forecasters are better at sticking with their
gut instincts or better off from doing so. Finally, our results suggested an innovation to the game
which could improve the crowd’s forecasting accuracy and which could be easily tested: before
making forecasts, some of the game players could be informed that sticking with their initial
judgement, or gut instinct, is likely to improve their chances of picking a correct score.

Replication of the "Asch Effect" in Bosnia and Herzegovina: Evidence for the Moderating Role of Group Similarity in Conformity

Replication of the "Asch Effect" in Bosnia and Herzegovina: Evidence for the Moderating Role of Group Similarity in Conformity. Muamer Ušto, Saša Drače, Nina Hadžiahmetović. Psychological Topics, Vol 28, No 3 (2019). www.pt.ffri.hr/index.php/pt/article/view/507

Abstract: In the present study, we tried to replicate a classic Asch effect in the cultural context of Bosnia-Herzegovina and to explore the potential impact of group similarity on conformity. To answer these questions Bosniak (Muslim) students (N = 95) performed the classic Asch line judgment task in the presence of five confederates (the majority) who were ostensibly either of a similar ethnic origin (in-group), different ethnic origin (out-group) or no salient ethnic origin. The task involved choosing one of three comparison lines that was equal in length to a test line. Each participant went through 18 test trials including 12 critical trials in which confederates provided an obviously wrong answer. In line with past research, the results revealed a clear-cut and powerful "Asch effect" wherein participants followed the majority in 35.4% of critical trials. More importantly, this effect was moderated by group similarity. Thus, in comparison to the condition with no salient group identity, conformity was maximized in the in-group majority condition and minimized in the out-group majority condition. Taken together, our results support the universal finding of the "Asch effect" and provide clear evidence that similarity with the majority plays an important role in the conformity phenomenon.

Keywords: conformity; Asch effect; self-categorization theory; group similarity


Discussion
In line with prior findings (e.g., Nicholson et al., 1985) we replicated the Asch
conformity effect. More than sixty years after Asch originally showed that American
students' judgments in an objective perception task were affected by the erroneous
estimates given by a unanimous majority group, Bosnian students were similarly
influenced under the same experimental circumstances. Interestingly, the conformity
in our sample even exceeded the usual level found in other replications of the Asch
experiment (20-30%, cf. Nicholson et al., 1985; Ross, Bierbrauer, & Hoffman, 1976;
Walker & Andrade, 1996). As we can see in Table 1, participants generally followed
the majority in 4 out of 12 critical trials (33.3%). However, when we look only at the
standard condition, where ethnic identity was not salient, we can see that the number
of errors was even higher, approaching the conformity level obtained by Asch in a
similar condition. One reason for these findings could be in cross-cultural differences
on the dimension of individualism-collectivism (Bond & Smith, 1996). In general,
individualistic cultures tend to prioritize independence and uniqueness as cultural
values. Collectivistic cultures, on the other hand, tend to see people as connected
with others and embedded in a broader social context. As such, they tend to
emphasize interdependence, family relationships, and social conformity. Given that
Bosnia and Herzegovina is closer to collectivistic values (probably due to the
legacy of communism) than North America and Western European countries, this
could explain higher levels of conformity in our sample.
Despite this converging evidence in favour of the conformity phenomenon, some
authors (e.g., Friend, Rafferty, & Bramel, 1990) pointed out that most people are not
conformists, but that only some individuals tend to conform due to individual
differences in personality. Therefore, it is possible that those conformist personalities
tend to maximize conformity rate, which may also explain the results in our study. If
this hypothesis is true, then the Asch effect should occur only for participants having
conformity disposition, but not for the rest of them: a hypothesis which was
disconfirmed by our results. Indeed, the follow-up analysis conducted without
participants who conformed on each stimulus revealed the overall level of
conformity of 36.29%. Moreover, the fact that 59.2% of subjects conformed on at
least one critical trial indicates that the majority of people exposed to the influence of
others tend to display conformist behaviour. Thus, the results we observed point to
conformity as a rather global phenomenon, which could not be attributed to the
idiosyncratic features of our subjects.

Besides the cross-cultural replication, another important aspect of our study is
that it showed that the Asch effect was clearly moderated by group similarity.
Consistent with the assumption of the SCT (Turner, 1991; Turner et al., 1987),
participants exposed to the in-group majority showed an increase in conformity in
comparison to the standard condition in which group identity was not salient.
Conversely, when the majority was presented as the out-group, the conformity effect
significantly dropped. Thus, we replicated and extended past research (Abrams et
al., 1990), showing that self-categorization could play a determining role in
conformity even in more minimal conditions, in which salient in- and out-group
characteristics (i.e., ethnicity) were completely irrelevant for the task at hand. As
such our findings could not be accounted for by the potential differences in objective
informational value (i.e., competence) but rather by the perception of similarity with
the majority. In addition, it should be noted that by including in- and out-groups,
which reflect prototypical ethnic divisions of Bosnian society, we created conditions
that enhanced the ecological validity of the present study. From this point, our
findings could have interesting implications for the understanding of social influence
processes in real life. Indeed, after showing how similarity with a particular ethnic
group moderates conformity in a clearly unambiguous task, we can easily anticipate
the power of the self-categorization process in situations where people have to deal with
a more complex and uncertain social reality involving real group interests, such as
support for a political decision or voting in a context in which group membership is
highly salient.

Sunday, December 15, 2019

Perception of the bodily cues, interoceptive sensibility (but not interoceptive accuracy), has a significant positive impact on subjective well‐being; a clear exception is gastric sensitivity

Do body‐related sensations make feel us better? Subjective well‐being is associated only with the subjective aspect of interoception. Eszter Ferentzi, Áron Horváth, Ferenc Köteles. Psychophysiology, 2019;e13319, January 10 2019. https://doi.org/10.1111/psyp.13319

Abstract: According to the proposition of several theoretical accounts, the perception of the bodily cues, interoceptive accuracy and interoceptive sensibility, has a significant positive impact on subjective well‐being. Others assume a negative association; however, empirical evidence is scarce. In this study, 142 young adults completed questionnaires assessing subjective well‐being, interoceptive sensibility, and subjective somatic symptoms and participated in measurements of proprioceptive accuracy (reproduction of the angle of the elbow joint), gastric sensitivity (water load test), and heartbeat tracking ability (Schandry task). Subjective well‐being showed weak to medium positive associations with interoceptive sensibility and weak negative associations with symptom reports. No associations with measures of interoceptive accuracy were found. Gastric sensitivity as opposed to heartbeat perception and proprioceptive accuracy moderated the association between interoceptive sensibility and well‐being. Thus, subjective well‐being is associated only with the self‐reported (perceived) aspect of interoception but not related to the sensory measures of interoceptive accuracy.


IAc = interoceptive accuracy

4 | DISCUSSION

In a cross‐sectional study with the participation of young healthy adults, subjective well‐being showed weak‐ to medium‐level associations with interoceptive sensibility even after controlling for gender and negative body‐related sensations (i.e., perceived symptoms). However, no associations with interoceptive accuracy (as assessed by heartbeat tracking ability, gastric sensitivity, and the proprioceptive error with respect to the elbow joint) were found. Moreover, an interaction between interoceptive sensibility and gastric sensitivity was revealed.

The positive association between subjective well‐being and interoceptive sensibility (i.e., the subjective or perceived aspect of interoception) replicates the findings of previous studies (Hanley et al., 2017; Tihanyi, Böőr, et al., 2016; Tihanyi, Sági, Csala, Tolnai, & Köteles, 2016). One explanation is that better psychological functioning and lower levels of perceived stress enable healthy individuals to allocate more attentional resources to various stimuli, including information originating in the body (Köteles et al., 2013). The finding that body‐mind interventions have a positive impact on interoceptive sensibility (Bornemann, Herbert, Mehling, & Singer, 2015; Fissler et al., 2016; Mehling et al., 2013; Rani & Rao, 1994) also supports this idea. It is also possible, however, that a more positive cognitive‐emotional condition simply biases self‐reports in a positive direction (Ferentzi, Drew, et al., 2018). Finally, in accordance with the tenets of body‐mind theorists, paying more attention to the body (i.e., gut feelings, emotions) may also lead to better functioning and improved well‐being (Bakal, 1999; Daubenmier, 2005; Farb et al., 2015; Mehling et al., 2009, 2011). This association might be behaviorally mediated; for example, more focus on body sensations might enable the individual to recognize symptoms of diseases and seek medical help earlier or change potentially risky behaviors in their early phase (Bakal, 1999; Fogel, 2013). However, interoception is a special perceptual process where raw sensory input plays a less salient role in shaping the conscious content than in the case of exteroception (Ádám, 1998). In other words, nonpathological interoceptive sensory information is usually ambiguous, thus its perception is heavily influenced by top‐down factors such as expectations, previous experiences, environmental cues (Brown, 2004; Friston, 2005; Friston, Kilner, & Harrison, 2006; Pennebaker, 1982).
In conclusion, the aforementioned top‐down factors will play a substantial role in the behaviors improving mental and physical health. The strength of the association (interoceptive sensibility explained approximately only 6%–8% of the variance of well‐being) appears realistic; as both constructs are influenced by a number of various factors, a substantially stronger association would be spurious.

Body‐focused attention does not necessarily improve the accuracy of detection of body signals (Ceunen et al., 2013; Silvia & Gendolla, 2001); in other words, there is a considerable dissociation between perceived and actual body‐related events (Ainley & Tsakiris, 2013; Ferentzi et al., 2017; Pennebaker, 1982). For example, subjective somatic symptoms were not related to either indicator of IAc in the current study, which basically reflects the often‐reported independence of symptom reports and body events (van den Bergh, Witthöft, Petersen, & Brown, 2017). Similarly, power posing (i.e., voluntarily adopting powerful postures to improve performance) evoked self‐reported changes in mood but did not influence hormone levels and behavior in risky situations (Ranehill et al., 2015). Although interoceptive sensibility was weakly associated with the cardiac indicators of IAc in our study, IAc did not contribute to subjective well‐being after controlling for gender, BMI, and resting HR in the regression analysis, and no interaction between interoceptive sensibility and cardioception was revealed. Taking into consideration that the regression analyses were also controlled for somatic symptoms (i.e., sensations from the body that are negative by definition), it can be concluded that the accuracy of detection of interoceptive changes does not have a direct positive or negative impact on well‐being.

The only interaction we found (i.e., gastric sensitivity moderates the association between interoceptive sensibility and well‐being) only partially supports the adaptivity hypothesis, as the contribution of interoceptive sensibility to well‐being is positive only for low and medium levels of gastric sensitivity. According to our result, the interaction between gastric sensitivity and interoceptive sensibility contributes to a higher level of well‐being in the two following cases: firstly, if low to medium gastric sensitivity is accompanied by high interoceptive sensibility, and, secondly, if high gastric sensitivity is accompanied by low interoceptive sensibility. We can only speculate about the interpretation of this result as well as why it was found for gastric sensitivity only. First of all, gastric fullness above a certain level is an unpleasant feeling, which leads to terminating the ongoing food and drink intake. This feeling occurs on a regular basis for everyone, whereas heart‐related and conscious proprioceptive experiences are less frequent under everyday circumstances. Concerning the interpretation of the interaction, high gastric sensitivity can turn the positive association between well‐being and interoceptive sensibility into negative because increased body focus might amplify the unpleasantness of the feeling of distension. This is in accordance with the view that bottom‐up and top‐down processes occur and interact with each other at almost every level of the interoceptive sensory system (Smith & Lane, 2015). Thus, making bodily sensations more conscious might not be beneficial in all cases; it is also an open question, however, whether our finding represents clinical relevance. We would also like to emphasize that this interpretation is speculation only, and the result needs to be confirmed by the replication of the study.

One of the limitations of the current study is that its conclusions are valid for healthy individuals only; atypical interoception may lead to issues in psychological development and represent a general susceptibility to psychopathology (Murphy, Brewer, Catmur, & Bird, 2017). Extremely low and high levels of interoceptive accuracy with respect to one single modality might also have modality‐specific pathological consequences. However, interoceptive accuracy is not a unitary construct (i.e., various interoceptive modalities are independent of each other with respect to IAc; Ferentzi, Bogdány, et al., 2018). This also implies that differences in the accuracy of detection of various bodily cues and modalities within the normal domain can even compete with each other, providing a complex body sensation (Smith & Lane, 2015). Thus, sensitivity with respect to a single channel does not necessarily influence everyday psychological functioning. Interoceptive sensibility, on the other hand, represents a more unitary (i.e., integrated) construct, therefore it may impact self‐reported characteristics such as well‐being.

Issues related to the sensory measurements of interoception have to be mentioned among the limitations of the current study. As IAc is not generalizable across modalities, the current study assessed three interoceptive channels. However, other modalities might be more relevant concerning subjective mental well‐being, such as breathing, the change of heart rate (rather than its actual state), sweating, or the sensation of body temperature change. The context and the interpretation of the bodily cues were also not investigated here, although both might influence self‐rated well‐being. Moreover, the Schandry task has received several criticisms recently and is not considered a reliable indicator of cardioceptive accuracy by some authors (Brener & Ring, 2016; Ring & Brener, 2018). Finally, participants were not screened for mental disorders and chronic conditions that might impact their performance. These issues and the characteristics of the sample (young adults with a relatively high subjective well‐being score) limit the external validity of the findings. In summary, subjective well‐being of healthy young adults is associated with the subjective (perceived) aspect of interoception but not related to interoceptive accuracy. Thus, the level of well‐being depends more on our subjective bodily report than on the actual accuracy of our bodily sensations.

The climate crisis is not just about the environment, but about human rights, justice, & political will; colonial, racist, & patriarchal systems of oppression have created & fueled it; they must be dismantled

Why We Strike Again. Greta Thunberg, Luisa Neubauer, Angela Valenzuela. Project Syndicate, Nov 29, 2019. https://www.project-syndicate.org/commentary/climate-strikes-un-conference-madrid-by-greta-thunberg-et-al-2019-11

Excerpts (emphasis not in the original piece):

After a year of strikes, our voices are being heard. We are being invited to speak in the corridors of power.

With public opinion shifting, world leaders, too, say that they have heard us. They say that they agree with our demand for urgent action to tackle the climate crisis. But they do nothing. As they head to Madrid for the 25th session of the Conference of the Parties (COP25) to the UN Framework Convention on Climate Change, we call out this hypocrisy.

That action must be powerful and wide-ranging. After all, the climate crisis is not just about the environment. It is a crisis of human rights, of justice, and of political will. Colonial, racist, and patriarchal systems of oppression have created and fueled it. We need to dismantle them all. Our political leaders can no longer shirk their responsibilities.


Check also: Greta Thunberg's zeal, as the press summarized her speech at the UN Climate Summit, Sep 23, 2019 https://www.bipartisanalliance.com/2019/09/greta-thunberg-as-press-summarized-her.html

Check also: We cannot legislate and spend our way out of catastrophic global warming. Jasper Bernes. Commune, Spring 2019. https://communemag.com/between-the-devil-and-the-green-new-deal/

Disentangling physics from the norms of patriarchal white supremacy must begin with an honest accounting of the roots of the Western scientific project in the project of slavery

Making Black Women Scientists under White Empiricism: The Racialization of Epistemology in Physics. Chanda Prescod-Weinstein. Signs, 2020, vol. 45, no. 2. https://www.journals.uchicago.edu/doi/pdfplus/10.1086/704991


[...] White empiricism is the phenomenon through which only white people (particularly white men) are read as having a fundamental capacity for objectivity and Black people (particularly Black women) are produced as an ontological other. This phenomenon is stabilized through the production and retention of what Joseph Martin calls prestige asymmetry, which explains how social resources in physics are distributed based on prestige. In American society, Black women are on the losing end of an ontic prestige asymmetry whereby different scientists “garner unequal public approbation” in their everyday lives due to ascribed identities such as gender and race (Martin 2017, 475). White empiricism is one of the mechanisms by which this asymmetry follows Black women physicists into their professional lives. Because white empiricism contravenes core tenets of modern physics (e.g., covariance and relativity), it negatively impacts scientific outcomes and harms the people who are othered. [...]


Excerpts of section "Prestige asymmetry and the manufacture of white empiricism"

A scientist using white empiricism as an analytic framework might assume that there is no dynamic relationship between the underrepresentation of Black women and knowledge production in physics, choosing to ignore evidence that the culture of physics limits participation via racist and sexist gatekeeping. Yet Helen Longino (1990) has persuasively argued that, even in the physical sciences, science is social knowledge. Janice Moulton’s “The Adversary Method” (1983) represents one analysis that shows how culture and knowledge production can come into conflict with concrete epistemic implications. Moulton succinctly notes in a section title that in philosophy there is an “unhappy conflation of aggression with success,” and Traweek observes the same among American high energy physicists (Moulton 1983, 149; Traweek 1992, 130). Making aggressive behavior a requirement for academic success is especially harmful to Black women, since Black women are demonized for engaging in behaviors that even hint at aggression (Harris-Perry 2011, 89).

Disentangling physics from the norms of patriarchal white supremacy must begin with an honest accounting of the roots of the Western scientific project in the project of slavery. Slavery is rarely the starting point for discussions of what many of us would call the post–Enlightenment era development of science, which Jonathan Marks helpfully defines as “the production of convincing knowledge in modern society” (2009, 2), but in order to understand the epistemic dismissal of Black women, we must begin with slavery. Science, mathematics, and slavery were intimately connected: whether it was the early evolution of insurance and actuarial science to calculate the value of jettisoned cargo—brutally murdered people—or efforts to minimize the bow wave—the wake—of ships, to make them faster, to speed the movement of kidnapped Africans from the torturous Middle Passage to a tortured lifetime and usually death in the bondage of chattel slavery (Sharpe 2016, 35). Even a century and a half after the end of slavery and with Black intellectuals making inroads in white-dominant academia, they continue to face epistemic injustice, epistemic marginalization, presumed incompetence, and the cognitive dissonance of consciously recognizing the white supremacy that pervades the scientific culture of “no culture” (Traweek 1992, 162).

From 2018... Those in the low opposite-sex exposure condition rated subsequent individual voices of the opposite sex as significantly more attractive than those who were in the high opposite-sex exposure condition

Hearing Sex at the Cocktail Party: Biased Sex Ratios Influence Vocal Attractiveness. John G. Neuhoff & Taylor N. Sikich. Auditory Perception & Cognition, Volume 1, 2018 - Issue 1-2, Sep 25 2018. https://doi.org/10.1080/25742442.2018.1518949

ABSTRACT: Visual exposure to unbalanced sex ratios influences perceived facial attractiveness for opposite-sex faces. When opposite-sex faces are scarce they are rated as more attractive than when they are plentiful. The current work examines a vocal-auditory analog of this effect. Participants were assigned to either a high or low opposite-sex vocal exposure condition and reported summary statistics by estimating the percentage of male and female voices in an array of simultaneous talkers. Participants then rated the attractiveness of individual opposite-sex voices. Those in the low opposite-sex exposure condition rated subsequent individual voices of the opposite sex as significantly more attractive than those who were in the high opposite-sex exposure condition. The findings demonstrate that a core visuo-perceptual aspect of mate selection preference also occurs in the auditory domain. The results are consistent with the idea that the attractiveness of opposite-sex partners is an honest signal of fitness and involves multimodal processes that are quickly modulated by the perceived availability of opposite-sex partners in a local environment.

KEYWORDS: Sex ratio, ensemble coding, summary statistics, vocal attractiveness, mate selection

Discussion

Simultaneously sounding voices have historically been treated as “background” stimuli in auditory perception research (Brungart & Simpson, 2007; Brungart, Simpson, Ericson, & Scott, 2001; Cox, Alexander, & Rivera, 1991; Darwin, 2008). However, the current results confirm that when directed to attend to multiple simultaneous voices, listeners can use ensemble coding to extract summary statistics and scale the percentage of male and female voices in the array (Neuhoff, 2017). Moreover, when listeners hear a low percentage of opposite-sex voices, subsequent individual opposite sex voices are perceived as more attractive than when they hear a high percentage of opposite-sex voices.

Sex Ratios and Vocal Attractiveness
The effect of unbalanced sex ratios on perceived attractiveness is consistent with previous work that examines the relationship between sex ratios and mate selection behavior. Favorable sex ratios (a larger choice of potential opposite-sex mates and fewer same-sex rivals) are associated with choosier mate selection behaviors and raised standards of attractiveness in a potential mate (Hahn et al., 2014; Munro et al., 2014; Watkins et al., 2012). From a theoretical perspective, modulating mate selection preferences and behaviors based on the perception of unbalanced sex ratios makes evolutionary sense. Sociosexual behaviors in populations with biased sex ratios skew toward the preferences of the minority sex, which can be more selective because they face less competition from same-sex rivals (Moss & Maner, 2016; Pedersen, 1991; Pollet & Nettle, 2008; Schmitt, 2005). Lowering attractiveness standards in the face of unfavorable sex ratios is a behavior that expands the pool of potential mates (Watkins et al., 2012). The current findings for unbalanced vocal sex ratios are consistent with research on sex ratios and facial attractiveness and provide converging support for a reliable relationship between vocal and visual attractiveness (Abend et al. 2015; Puts et al., 2016).

This suggests that observers use multimodal sources of information when evaluating potential opposite-sex partners and that the process may involve a high degree of automaticity. For example, Mileva, Tompkinson, Watt, and Burton (2018) showed that impression formation involves a mandatory and immediate integration of both vocal and facial information. Future work might examine the degree to which the perception of summary statistics from voices and the effects of unbalanced sex ratios on attractiveness involve automatic processes. In the current work, listeners accurately scaled sex ratios after exposures of only 1500 ms and showed effects of unbalanced sex ratios on perceived attractiveness after cumulative exposure of only 1.2 min (48 trials × 1500 ms). We also found a main effect for the number of voices presented in the exposure phase. Listeners presented with 5 simultaneous voices perceived subsequent individual voices to be more attractive than those first presented with 10 simultaneous voices. Although we did not specifically ask our participants to report the number of voices in the exposure stimuli, the results are consistent with the overarching hypothesis that standards of attractiveness will be lowered (i.e., voices will be rated as more attractive) when the number of potential opposite-sex partners is diminished.

Finally, we found a main effect for participant sex that indicated men found female voices more attractive than women found male voices. This finding could simply be a function of the relative attractiveness between male and female voices in our study. However, it is also a finding that occurs consistently when men and women are asked to give opposite-sex attractiveness ratings (Gladue & Delaney, 1990; Hahn et al., 2014; Johnco et al., 2010) and is consistent with a higher priority in men than in women for physical attractiveness as an important criterion for mate selection (Boxer et al., 2015; Buss, 1989; Buss & Barnes, 1986).

Effect Sizes
We found very large effect sizes between conditions when listeners were asked to judge the percentage of males and females in our multiple voice exposure stimuli. The effect size for the linear trend for perceived sex ratio as a function of actual sex ratio was ηp² = .42 (equivalent to Cohen’s d = 1.7). Neuhoff (2017) also found large effect sizes when participants were asked to scale vocal sex ratios that ranged from 0% to 100%. The size of the effect speaks to the robust ability of listeners to scale sex ratios of multiple simultaneous voices.

However, even effect sizes this large likely underestimate the true effect size that might occur in more natural environments. Under natural listening conditions, multiple simultaneous talkers emanate from separate locations in space (rather than centrally from headphones or loudspeakers). Spatial separation of talkers reduces auditory cognitive load and affords a better assessment of target speech among multiple talkers (Andeol, Suied, Scannella, & Dehais, 2017; Bronkhorst, 2000; Shinn-Cunningham, Ihlefeld, Satyavarta, & Larson, 2005). Thus, spatial separation might also afford more accurate estimates of sex ratios. In a similar light, it may also be the case that durations of exposure to multiple voices longer than 1500 ms would provide a better assessment of vocal sex ratios.

In contrast to the large effect sizes for scaling sex ratios, the effect size for the difference in attractiveness ratings between high and low opposite-sex exposure conditions was comparatively small (ηp² = .02, equivalent to Cohen’s d = .29). Our design had sufficient power to detect this effect size, and it may be that the factors of increased spatial separation and stimulus duration that would occur in a natural environment would also increase the effects of unbalanced sex ratios on attractiveness. The fact that exposure and attractiveness ratings occurred in temporally separate blocks may also contribute to the smaller observed effect size.

However, effect sizes need not be large to be important from an evolutionary perspective. On the contrary, small but reliable effect sizes can be instrumental in explaining how our evolutionary history shaped current perceptual and cognitive abilities (Voyer, Voyer, & Bryden, 1995; Weiss, Kemmler, Deisenhammer, Fleischhacker, & Delazer, 2003; Zilles et al., 2016). For example, in evolutionary psychology, finding sex differences can be critically important evidence that supports a behavioral adaptation. Yet, a meta-analysis of 286 studies on sex differences in spatial perception showed a mean effect size of only d = .37 (ηp² = .03; Voyer et al., 1995). Although such small effect sizes are not helpful in predicting the behavior of any particular individual based on sex, they are indicative of differential challenges faced by men and women over the course of evolutionary history. The effect size in our results is also similar to that found for the effect of biased sex ratios on facial attractiveness (ηp² = .02; Hahn et al., 2014).
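
The excerpt reports each partial eta squared alongside an equivalent Cohen’s d but does not state the conversion it used. As a rough sketch, one common textbook conversion goes through Cohen’s f and assumes two equal-sized groups (f² = ηp² / (1 − ηp²), d = 2f). That assumption is ours, not the authors’, and the function names below are purely illustrative. Under it, the reported pairs (.42 and 1.7, .02 and .29, and d = .37 and .03) appear to be reproduced to rounding.

```python
import math


def eta_sq_to_d(eta_sq):
    """Convert partial eta squared to Cohen's d.

    Assumption (not stated in the excerpt): two equal-sized groups,
    so that d = 2 * f, where f**2 = eta_sq / (1 - eta_sq).
    """
    if not 0.0 <= eta_sq < 1.0:
        raise ValueError("partial eta squared must be in [0, 1)")
    f = math.sqrt(eta_sq / (1.0 - eta_sq))
    return 2.0 * f


def d_to_eta_sq(d):
    """Inverse conversion under the same equal-groups assumption."""
    f_sq = (d / 2.0) ** 2
    return f_sq / (1.0 + f_sq)


if __name__ == "__main__":
    # Values quoted in the Effect Sizes discussion above.
    for eta_sq in (0.42, 0.02):
        print(f"eta_p^2 = {eta_sq:.2f} -> d = {eta_sq_to_d(eta_sq):.2f}")
    print(f"d = 0.37 -> eta_p^2 = {d_to_eta_sq(0.37):.2f}")
```

With unequal group sizes or a different design, the relationship between the two measures changes, so these figures should be read as approximate cross-checks rather than exact equivalences.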

Limitations and Future Research
Our sample included only heterosexual participants. Thus, it is an open question as to how exposure to unbalanced sex ratios might influence participants of other sexual orientations or how participant sexual orientation might interact with the orientation of the to-be-judged talker. Although our results do not speak to these questions, there is considerable evidence to suggest that sexual orientation is likely an important factor in these kinds of investigations and could be a fruitful avenue for further research (Hancock & Pool, 2017; Munson, 2007; Rule, 2017; Valentova, Roberts, & Havlicek, 2013).

The online nature of our data collection introduced variability that might not have been present under more controlled laboratory conditions. For example, participants listened to the stimuli as compressed mp3 files on their own devices at different volume levels, with varying amounts of background noise in each unique listening environment. Nonetheless, all these factors introduce variability that makes rejecting the null hypothesis less likely. Finding significant results in the face of this increased variability speaks to the robust nature of the effects and increases the external validity of the findings.

Online data collection also resulted in a more diverse sample than what we would expect to obtain in typical undergraduate samples. While this is a desirable characteristic of samples, the mean age of our participants (39 years) was considerably older than that of the talkers whose voices were rated for attractiveness (20 years). Although this poses no threat to internal validity (all participants rated voices of the same age), it would be interesting to examine how participant and talker age interact in future studies of sex ratios and attractiveness.

Overconfident people should be surprised that they are so often wrong. Are they?

Overprecision Increases Subsequent Surprise. Derek Schatz, Don A. Moore. bioRxiv, December 13, 2019. https://doi.org/10.1101/2019.12.13.875203

Abstract: Overconfident people should be surprised that they are so often wrong. Are they? Three studies examined the relationship between confidence and surprise in order to shed light on the psychology of overprecision in judgment. Participants reported ex-ante confidence in their beliefs, and after receiving accuracy feedback, they then reported ex-post surprise. Results show that more ex-ante confidence produces less ex-post surprise for correct answers; this relationship reverses for incorrect answers. However, this sensible pattern only holds for some measures of confidence; it fails for confidence-interval measures. The results can help explain the robust durability of overprecision in judgment.

GENERAL DISCUSSION
Our results show that ex-ante confidence and ex-post surprise are inextricably linked. Our primary finding is that when people are correct, greater ex-ante confidence produces less ex-post surprise, whereas when they are incorrect, greater ex-ante confidence produces more ex-post surprise. We examine the psychology underlying these relationships and identify moderators that can either suppress or enhance their strength. Studies 1 and 2 establish the link between confidence and surprise, highlighting that correctness is a powerful moderator of the relationship. Studies 2 and 3 employ exogenous manipulations of confidence; their results replicate the correlational results of Study 1. Study 2 finds more powerful confidence-correctness interaction effects on surprise for epistemic questions than for aleatory ones, consistent with the notion that feeling personally accountable for knowing or not knowing the answer increases the intensity of emotional reactions to being right or wrong. Study 3 finds that people are more surprised about being wrong than they expect to be.

What of the utility of surprise? If surprise reflects prediction error, individuals should seek to maximize accuracy and minimize surprise (Ely, Frankel, & Kamenica, 2015). This implies that surprise should lead people to reduce their subsequent confidence. Our results suggest that surprise does not always play this functional role, or that it is difficult to measure consistently. Future research should examine the conditions under which surprise has a corrective effect on subsequent confidence. How quickly does this effect decay, and what possible moderators could increase the calibrating power and longevity of feedback on subsequent confidence? Could incorrect answers in epistemic domains more central to one’s self-concept ‘stick’ for a longer period of time, forcing a re-evaluation of one’s believed expertise? Or could the opposite be the case, where the incorrect answer is considered anomalous and the sense of expertise persists?

We aspired to measure the effects of overprecision on surprise. In recording participants’ ex-ante confidence, their correctness, and their ex-post surprise, we document consistent evidence suggesting that people expect to be correct. If they go into a decision with confidence, they are more surprised to be incorrect, and less surprised when correct. We believe these results do more than underscore precision in judgment. Rather, this research approaches the topic with a new paradigm that serves to reveal another layer in the scientific understanding of the psychology of confidence and precision in judgment.