Monday, December 9, 2019

Men Are Funnier than Women under a Condition of Low Self-Efficacy but Women Are Funnier than Men under a Condition of High Self-Efficacy

Men Are Funnier than Women under a Condition of Low Self-Efficacy but Women Are Funnier than Men under a Condition of High Self-Efficacy. Tracy L. Caldwell, Paulina Wojtach. Sex Roles, December 9 2019. https://link.springer.com/article/10.1007/s11199-019-01109-w

Abstract: The debate about whether women can be as funny as men pervades the popular press, and research has sometimes supported the stereotype that men are funnier (Mickes et al. 2011). The goal of the present research was to determine whether this gender difference can be explained by differences in beliefs about one’s capability for humor (“humor self-efficacy”). Male and female U.S. undergraduates (n = 64) generated captions for 20 cartoons and rated their own humor self-efficacy. Subsequently, an independent sample of 370 Amazon Mechanical Turk (MTurk) users evaluated these captions in a knockout-style tournament in which pairs of captions were presented with each of the cartoons. Each participant was randomly assigned to evaluate captions which were authored by men and women selected to be either low or high in humor self-efficacy. In the initial round of the tournament, each caption was authored by a man and a woman matched for comparable levels of self-identified humor self-efficacy. In subsequent rounds, the remaining captions were paired randomly. MTurk users, unaware of the captioners’ gender, selected the captions of men as funnier only under the low self-efficacy condition and those of women as funnier under the high self-efficacy condition. These data suggest that self-efficacy may be a critical determinant of the successful performance of humor. When people say that women are not funny, they may be relying on an unfounded stereotype. We discuss how this stereotype may negatively affect perceptions of women in the workplace and other settings.

Keywords: Humor; Cartoons; Human sex differences; Sex roles; Stereotyped attitudes; Sex role attitudes
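The knockout-style evaluation described in the abstract (matched man-woman caption pairs in round 1, random pairing of surviving captions thereafter) can be sketched as follows. This is a minimal illustration, not the authors' actual procedure; the `judge` callable stands in for an MTurk rater's choice:

```python
import random

def knockout(first_round_pairs, judge, rng):
    """Single-elimination caption tournament.

    first_round_pairs: list of (caption_a, caption_b) tuples; in the study,
        round 1 paired a man's and a woman's caption matched on self-efficacy.
    judge: callable returning the funnier of two captions.
    rng: random.Random instance used for later-round pairing.
    """
    winners = [judge(a, b) for a, b in first_round_pairs]  # round 1: matched pairs
    while len(winners) > 1:
        rng.shuffle(winners)                               # later rounds: random pairing
        nxt = [judge(winners[i], winners[i + 1])
               for i in range(0, len(winners) - 1, 2)]
        if len(winners) % 2:                               # odd caption advances on a bye
            nxt.append(winners[-1])
        winners = nxt
    return winners[0]
```

With a deterministic judge, e.g. `knockout([(1, 2), (3, 4), (5, 6), (7, 8)], max, random.Random(0))`, the highest-rated caption always survives every round.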


Discussion

In replication of findings by Mickes et al. (2011) and
Greengross and Miller (2011), but contrary to those by
Howrigan and MacDonald (2008), our participants found
men to be, on average, more skilled at captioning New
Yorker cartoons than women. This difference was qualified,
as hypothesized, by the humor self-efficacy of the captioner;
male captioners were funnier in the low self-efficacy condition
and female captioners were funnier in the high self-efficacy
condition. Participants’ response to the question, “Who is funnier?,”
which can be interpreted as an implicit measure of
whether they endorse the stereotype that women are not funny,
varied by sample: The most frequent answer among
captioners in Phase 1, whose participants were all undergraduates, was
“neither” (42.2%), followed by “men” (38.2%),
then “women” (19.6%). This pattern was the same for men
and women. Those who rated the captions in Phase 2 were a
more age-diverse sample of MTurk users, and their answers
were “men” (44.7%), followed by “neither” (37.9%), then
“women” (17.3%), and these responses were heavily influenced by participants’
gender, such that those who responded
that men are funnier were overwhelmingly male, whereas
those who said women were funnier were overwhelmingly
female. However, contrary to Mickes et al.’s (2011) findings,
raters’self-reported preference for men’s humor did not cause
them to show a gendered preference for men’s humor greater
than that of women. In other words, although men (but not
women) told us that men are funnier, their preference for
men’s humor was no greater than that of women, and it was
not in evidence when rating the captions of women who were
higher in self-efficacy.
We anticipated that men’s superiority at captioning New
Yorker cartoons would lessen under conditions of high self-efficacy
on the basis that the male and female captioners in
Mickes et al.’s (2011) study differed in their self-reported self-confidence.
We argue that this gender difference in self-confidence could
also account for the gender differences in
humor observed in other studies (Greengross and Miller 2011;
Howrigan and MacDonald 2008). When women internalize a
culture’s prescriptions against using humor, they chronically
operate under conditions of low self-efficacy. Our data suggest
that men and women are matched in their ability to perform
humor, but that their self-beliefs and the contexts that prime
these self-beliefs may influence its skilled performance.
Consider a compelling example of this reasoning in research
by Hull et al. (2016): Their participants were asked to produce
humor under one of two instructional cues, to either be funny
or to be catchy. All participants performed better when told to
be catchy, but women outperformed men when told to be
funny. Context mattered and confidence may be key. Also
consider research by Hooper et al. (2016), in which they asked
undergraduates from Britain, Canada, and Australia to rate
captions submitted to the New Yorker’s captioning contest by
men and women wishing to test their comic mettle. There
were no differences in the rated funniness of the captions,
except in Britain, where women’s captions were favored over
men’s. Their data show that when women have the confidence
to self-select into a captioning task, they perform at least
equally as well as men in some cultural contexts and better
in others for the same jokes.
The goal of the present study was to investigate whether
one could create a circumstance under which women could
perform humor at least as well as men. The answer,
surprisingly, was that under conditions of high self-efficacy,
women were even more capable than men, a finding that
is “surprising” in light of the ubiquity of messages prohibiting
women’s performance of humor, including this comment on
the YouTube trailer for the all-female Ghostbusters reboot
(Sony Pictures Entertainment 2016): “Ok so I read the description
and I noticed something strange, it says ‘rebooted
with a new cast of hilarious characters’....Did these hilarious
character just not make it into the final movie or did I miss
something?” It could be easy to dismiss this kind of prejudice,
given that it is directed to four superstar comics whose careers
have fared quite well in spite of detractors. However, there are
contexts in which prejudice against women’s humor can be
more widely consequential. For example, Decker and
Rotondo (2001) found, in the workplace, that women who
used negative humor (sexual and offensive humor) were
judged as being less effective leaders than men who used the
same kind of humor and that this finding emerged even when
controlling for the gender of the respondent. When the humor
was positive, on the other hand, women were rated higher on
relationship behaviors and effectiveness than men. Their data
suggest that both men and women hold implicit beliefs about
humor and gender roles and that women’s financial well-being
is at greater risk than men’s when they use sexual and offensive humor.
Another place in which humor and gender are performed
with some risk is in romantic attraction. In the context of
heterosexual attraction, when it comes to humor, men have
indicated they prefer women who will laugh at their jokes to
those who will produce their own humor (Bressler et al. 2006;
Hone et al. 2015). Other research indicates that a woman’s use
of humor is, at best, irrelevant to her potential mate value (i.e.,
when given the choice between a woman who uses humor to
one who does not, they show no clear preference; Bressler and
Balshine 2006; Wilbur and Campbell 2011) and at worst, it
makes women less attractive (Lundy et al. 1998). Women’s
use of humor, then, may be the basis for rejection at the box
office, in the workplace, and in the dating context.
To be clear, there are just as many studies demonstrating
that men and women do not differ in their interest in humorous
partners in the context of romantic relationships (i.e., not all
men find humor unattractive; Buss 1988; DiDonato et al.
2013; Feingold 1992; Kenrick et al. 1990; Treger et al.
2013). Likewise, there are movie trailers starring all-female
casts of comics that are not trolled hard. For example, whereas
Sony Pictures Entertainment’s (2016) official Ghostbusters
trailer has nearly 300,000 YouTube comments, largely negative, Universal Pictures’s (2011) official Bridesmaids trailer
has under 1000. Perhaps the critical difference is that
Ghostbusters treads on hallowed male comics’ ground (the
original starred comedy darlings Bill Murray, Dan Aykroyd,
and Harold Ramis) and Bridesmaids is a “chick flick.” Similar
hallowed ground was encroached upon in the newest take on
the comedy-heist “Ocean’s” series, Ocean’s 8, starring an all-female cast,
whose YouTube trailer (Warner Bros. Pictures
2018) comments include, “Oh great, another all-female reboot,
‘cause Ghostbusters turned out great” and “Coming
soon: No Country for Old Women.” These examples suggest
that the answer to the question “Should women use humor?”
is: “Sure, as long as it is performed in contexts qualitatively
distinct from those in which men perform it.” Granted, these comments represent
the sentiments of a highly self-selected sample; thus, future
research might explore just how representative
they are. Recall, however, that men’s self-professed preference for men’s over women’s humor was
not matched, in our sample of raters, with a measured preference for
men’s cartoon captions greater than that of women’s
cartoon captions. We take these data as preliminary evidence
that gender bias shaped men’s stated preferences.
Limitations
Overall, our data suggest that the circumstance
under which men are likely to “out-humor” women is when
women’s self-efficacy is low. One shortcoming of our study is
that we cannot determine the provenance of our captioners’
self-efficacy: Did we create experimental contexts that caused
differences in humor self-efficacy or were these differences
pre-existing? Results from the caption-generation phase indicated we
were not able to demonstrate, at the level of the entire
sample, that we had successfully manipulated self-efficacy, so
we “cherry-picked” our captioners to create quasi-experimental
conditions of low and high self-efficacy, to determine if differences
in captioners’ self-efficacy were detectable by an independent
sample of raters. The quasi-experimental nature of our study makes it difficult to rule
out raw humor talent as a third variable that could simultaneously
explain captioners’ self-efficacy and their humorousness. That said,
this third variable explanation cannot explain
why men were perceived as funnier in the low self-efficacy
condition and women as funnier in the high self-efficacy condition; if
raw talent were the cause of captioners’ self-efficacy
ratings and their humorousness, then it should have had the
effect of equalizing humor ability across gender. Nevertheless,
if our paradigm were to be used again, the manipulation of
self-efficacy would need to be strengthened to establish its causal role.
It is possible that we did in fact successfully manipulate
self-efficacy but that our assessment of it was not sensitive
enough. Bandura (1977) recommends a microanalytic assessment strategy
in which one assesses self-beliefs about a
targeted and objective behavioral outcome in a particular domain (e.g.,
“How confident are you that you can correctly
answer seven of these ten questions about global climate
change?”). Our assessment of humor self-efficacy consisted
of asking participants “How funny do you think others will
find your cartoon captions?” In hindsight, our question may
not have assessed self-beliefs so much as it did their beliefs
about others’ perceptions of their humor. An individual can
simultaneously be very confident that she can crack jokes that
she will find hilarious while recognizing that the average person
might not appreciate her attempts at humor. Moreover, our
question does not ask them about a behavioral outcome that
can be objectively measured. This is due in part to the subjectivity
of humor. Nevertheless, a rephrasing that comes closer
to Bandura’s original recommendation is: “How confident are
you that you can write at least three [or four or five, etc.]
captions that others with a similar sense of humor might find
funny?” A better assessment of self-efficacy is critical for
determining its role in explaining gender differences in humorous performance.
In short, participants within each of our quasi-experimental
conditions of the captioning phase shared something in common that
led to differences in their humorous performance.
We believe that self-efficacy was what caused variations in
their humorous performance, but we cannot completely rule
out raw talent as a third variable. A more effective manipulation
and assessment of self-efficacy is warranted.

Witnessing fewer credible cultural cues of religious commitment is the most potent predictor of religious disbelief, β=0.28, followed distantly by reflective cognitive style

Gervais, Will M., Maxine B. Najle, Sarah R. Schiavone, and Nava Caluori. 2019. “The Origins of Religious Disbelief: A Dual Inheritance Approach.” PsyArXiv. December 8. doi:10.31234/osf.io/e29rt

Abstract: Religion is a core feature of human nature, yet a comprehensive evolutionary approach to religion must account for religious disbelief. Despite potentially drastic overreporting of religiosity [1], a third of the world’s 7+ billion human inhabitants may actually be atheists—merely people who do not believe in God or gods. The origins of disbelief thus present a key testing ground for theories of religion. Here, we evaluate the predictions of three prominent theoretical approaches to the origins of disbelief, and find considerable support for a dual inheritance (gene-culture coevolution) approach. This dual inheritance model [2,3] derives from distinct literatures addressing the putative 1) core social cognitive faculties that enable mental representation of gods [4–7], 2) the challenges to existential security that motivate people to treat some god candidates as real and strategically important [8,9], 3) evolved cultural learning processes that influence which god candidates naïve learners treat as real rather than imaginary [3,10–12], and 4) the intuitive processes that sustain belief in gods [13–15] and the cognitive reflection that may sometimes undermine it [16–18]. We explore the varied origins of religious disbelief by analyzing these pathways simultaneously in a large nationally representative (USA, N = 1417) dataset with preregistered analyses. Combined, we find that witnessing fewer credible cultural cues of religious commitment is the most potent predictor of religious disbelief, β = 0.28, followed distantly by reflective cognitive style, β = 0.13, and less advanced mentalizing, β = 0.05. Low cultural exposure to faith predicted about 90% higher odds of atheism than did peak cognitive reflection. Further, cognitive reflection predicted reduced religious belief only among individuals who witness relatively fewer credible contextual cues of faith in others.
This work empirically unites four distinct literatures addressing the origins of religious disbelief, highlights the utility of considering both evolved intuitions and cultural evolutionary processes in religious transmission, emphasizes the dual roles of content- and context-biased social learning[19], and sheds light on the shared psychological mechanisms that underpin both religious belief and disbelief.

Factors Predicting Religious (Dis)belief

To assess the four different factors that may drive religious disbelief, we measured participants’ mentalizing abilities, feelings of existential security, exposure to credible cues of religiosity (CREDs), and reflective versus intuitive cognitive style.

We measured advanced mentalizing abilities, which correspond to mindblind atheism, using the Perspective Taking Subscale of the Interpersonal Reactivity Index [75]. This measure includes items like “I try to look at everybody’s side of a disagreement before I make a decision” and “Before criticizing somebody, I try to imagine how I would feel if I were in their place,” measured on a scale from 1 (strongly disagree) to 7 (strongly agree). This scale reached an acceptable level of reliability, α = 0.77, M = 4.79, SD = 0.78. We measured feelings of existential security, which corresponds to apatheism, with a number of items assessing concerns that are salient to participants and participants’ faith in institutions like the government, health care, and social security to provide aid in the face of need [44]. Items about the salience of different concerns included questions about how often participants worry about losing their job, worry about having enough money in the future, and feel they cannot afford things that are necessary. These items were assessed on a scale from 1 (never) to 4 (all the time). Illustrative items regarding faith in institutions include “How much do you feel confident in our country’s social security system” and “How much do you feel that people who start out poor can become wealthy if they work hard enough,” assessed on a scale from 1 (not at all) to 4 (a lot). Items measuring faith in institutions were reverse-scored, and all items were averaged together to form a composite index of existential insecurity (α = 0.77, M = 2.2, SD = 0.39), with higher scores reflecting more insecurity.
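The reverse-scoring-and-averaging step can be made concrete with a minimal sketch. The item values are hypothetical, and reverse-scoring on a 1-4 scale is assumed to be 5 − x:

```python
def insecurity_index(worry_items, faith_items):
    """Composite existential-insecurity score on a 1-4 scale.

    worry_items: responses where higher = more worry (kept as-is).
    faith_items: responses where higher = more faith in institutions;
        reverse-scored (5 - x) so that higher = more insecurity.
    """
    items = list(worry_items) + [5 - x for x in faith_items]
    return sum(items) / len(items)
```

For example, a respondent who always worries (all 4s) and reports no institutional faith (all 1s, reversed to 4s) scores 4.0, the scale maximum.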

We measured cognitive reflection, which corresponds to analytic atheism, using nine items from the Cognitive Reflection Test [76–78]. This measure poses a series of questions to participants that rely on logical reasoning to answer correctly. All have a seemingly simple initial answer, but upon further consideration people arrive at a different (and correct) answer. We therefore measured whether or not participants provided the correct answers to these questions that require more cognitive reflection. If they answered a question correctly, they were given a 1, and if they answered it incorrectly, they were given a 0. Our full index of cognitive reflection is composed of the sum of the number of questions that each participant answered correctly, with a higher score thus indicating a more reflective and analytic cognitive style. The average score was 3.18, with a standard deviation of 2.66. We measured exposure to CREDs, which corresponds to inCREDulous atheism, with the CREDs Scale [10].

This scale assesses the extent to which caregivers demonstrated religious behaviors during the respondent’s childhood, such as going to religious services, acting as good religious role models, and making personal sacrifices to religion. The frequency of these types of behaviors was measured on a scale ranging from 1 (never) to 4 (always). This scale was highly reliable, α = 0.93, M = 2.42, SD = 0.84.
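The CRT scoring rule described above (1 per correct answer, 0 otherwise, summed into an index) can be sketched as follows. The item names and answer key are the three classic CRT items, used purely for illustration; the study itself used nine items:

```python
# Illustrative answer key: the three classic CRT items (bat-and-ball in cents,
# widget-machines in minutes, lily-pad doubling in days). The study used nine items.
CRT_KEY = {"bat_ball_cents": 5, "widget_minutes": 5, "lily_pad_days": 47}

def crt_score(responses, key=CRT_KEY):
    """Sum of correctly answered items: 1 per correct answer, 0 otherwise."""
    return sum(1 for item, answer in responses.items() if key.get(item) == answer)
```

A respondent giving the intuitive-but-wrong answers (10 cents, 100 minutes, 24 days) scores 0; a fully reflective respondent scores 3.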

Sunday, December 8, 2019

NL: Occupation with the highest life satisfaction was ship/aircraft controller; lowest life satisfaction was in forestry; highest for women was creative & performing artist, for men it was keyboard operator

Van Leeuwen, J. & Veenhoven, R.  Would I be happier as a teacher or a carpenter? Erasmus Happiness Economics Research Organization EHERO, Working Paper 2019/4. https://personal.eur.nl/veenhoven/Pub2010s/2019n-full.pdf

ABSTRACT: Most people are looking for ways to make their life as happy as possible. Since we work a great part of our life time, it is worth knowing which occupations will bring us the most happiness and which will bring the least. This requires information on how happy people are in different occupations and, in particular, what kinds of people are the happiest in what kinds of occupation. We sought answers to these questions using data taken from the WageIndicator for 2006 to 2014 for the Netherlands. The large dataset of 160,806 respondents made it possible to assess differences in happiness levels in 130 occupations and to split the results across 4 personal characteristics. The occupation in the Netherlands with the highest life satisfaction was ship and aircraft controller and technician working in this field. The occupation with the lowest life satisfaction was forestry and related work. The occupation giving the most life satisfaction for women was creative and performing artist; for men it was keyboard operator.

Key words: Happiness, Life-satisfaction, Occupational choice

4.2 Further research along this line

Replication on a more representative dataset
This research can be repeated with more recent data and with a more representative data set, e.g., using the workforce survey of Statistics Netherlands (CBS, Dutch Labour Force Survey (LFS), 2019). This will reduce the effect of self-selection and remove the effects of the economic recession.

Replication on a larger dataset
The WageIndicator provides not only data for the Netherlands but has information for 93 countries (WageIndicator, 2019). Pooling data obtained in other developed countries will produce a much larger dataset than used here, which will allow us to consider more specific kinds of people.
Subsequent research can provide insights into cross-national differences in happiness across occupations around the world.

Replicate on job-satisfaction
It is also possible to investigate job-satisfaction in the same way we have investigated life-satisfaction in this study. In the case of the WageIndicator data, job-satisfaction needs to be transformed to the same scale as life-satisfaction, to make it possible to investigate its cohesion, or lack thereof, with life-satisfaction. In the case of job-satisfaction we also need to investigate further personal scales next to personal characteristics, in combination with the scales needed for each occupation.

Assess difference between job-satisfaction and life-satisfaction
Above in section 2.1, we noted that it is easier to estimate the degree of job-satisfaction one will experience in a particular occupation than to predict how that occupation will affect one’s wider life-satisfaction. It is therefore worth getting a view on the differences that exist between job-satisfaction and life-satisfaction in occupations. Are there substantial differences? If so, which occupations provide more job-satisfaction than life-satisfaction and which more life-satisfaction than job-satisfaction?
A specific group to be considered in this context are the people who work as entrepreneurs. Of course, a distinction can be made between various types of entrepreneurship, for example self-employed entrepreneurs and family entrepreneurs. This group of workers has not been examined here, but the life-satisfaction and job-satisfaction of different types of entrepreneurs would be a good topic for future research.

Assess effect of job characteristics
Receiving direct feedback from peers, customers, patients, students or engineered devices might lead to higher life satisfaction than situations in which one’s actions provide no direct feedback and one has to rely on one’s own judgement of the quality of the output delivered. Besides, the variation of this effect over the course of one’s career could also be assessed.

Assess effect of pay
Based on the results presented here it is possible to look at occupations in a different way: we can now look both at the income-based results of work and, importantly, at the effect of types of occupation on life-satisfaction. This means that it becomes possible to see payment as compensation for lower life satisfaction, a new way to look at our working lives.

Acupuncture—A Question of Culture... and of skill to induce impression of being effective :-)

Acupuncture—A Question of Culture. Matthias Karst, Changwei Li. JAMA Netw Open. 2019;2(12):e1916929. Dec 6 2019, doi:10.1001/jamanetworkopen.2019.16929

By the end of radiotherapy for head and neck cancers, more than 50% of patients experience radiation-induced xerostomia (RIX), a condition manifested by a long-lasting perception of dry mouth. Radiation-induced xerostomia is associated with a series of complications, such as difficulty sleeping and speaking, dysgeusia, and dysphagia, that significantly affect patients’ quality of life. A 2019 review of clinical trials1 compiled several strategies against RIX and reported that sialogogue medications, sparing parotid glands by intensity-modulated radiation therapy, and salivary gland transfer have been shown to be effective but at the cost of adverse events or persistent symptoms after treatment. A 2015 randomized clinical trial2 demonstrated that patients with RIX who received acupuncture-like transcutaneous electrical nerve stimulation had marginally better responses and significantly fewer adverse events compared with patients who received oral pilocarpine. This trial suggested that acupuncture may be a promising approach to prevent RIX.

In the study by Garcia et al,3 results of a 2-center, phase 3, randomized, sham-controlled clinical trial for the treatment and prevention of RIX with acupuncture are presented. Interestingly, one center was situated in the United States, and the other was in China. A classic 3-arm study design was used to compare true acupuncture (TA) and sham acupuncture (SA) with a standard care control (SCC). Compared with SCC, TA resulted in significantly lower xerostomia scores and lower incidence of clinically significant xerostomia 1 year after treatment, while the SA was not significantly associated with improved xerostomia scores. However, no significant difference between TA and SA xerostomia scores was observed, and both acupuncture groups combined showed significantly lower xerostomia scores compared with SCC. This phenomenon is often found in acupuncture trials and may be resolved by the increase of the overall sample sizes or, at least, by the disproportionate increase of the size of the TA group to detect differences between TA and SA.
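The sample-size point can be made concrete with a standard normal-approximation power calculation for comparing two group means. The effect size, α, and power below are illustrative assumptions, not values from the trial:

```python
from math import ceil
from statistics import NormalDist

def samples_per_group(d, alpha=0.05, power=0.80, ratio=1.0):
    """Per-group n to detect a standardized mean difference d between two
    groups, with allocation ratio n_big / n_small = ratio (normal approx.)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    n_small = (1 + 1 / ratio) * (z / d) ** 2
    return ceil(ratio * n_small), ceil(n_small)

# Detecting a small TA-vs-SA difference (d = 0.3) at alpha = .05, power = .80
# needs 175 per group at 1:1 allocation; a 2:1 allocation toward TA needs
# 262 in TA vs. 131 in SA.
```

Under these assumptions, the 2:1 design detects the same difference but requires more participants in total (393 vs. 350), quantifying how disproportionately the acupuncture arms would need to grow to separate TA from SA.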

One of the significant and exciting findings in the study by Garcia et al3 is the differences between the US and Chinese study sites. Among US patients, only the SA group showed a significantly better xerostomia score compared with the SCC group, while no differences were observed between the TA and SCC groups. In contrast, among Chinese patients, TA significantly improved the xerostomia scores compared with SA and SCC, while the SCC and SA had very similar efficacy. In other words, the Chinese study population clearly showed a hypothesis-confirming result, while the US study population seemed to have been more susceptible to SA. This finding coincides with the opposite tendency of the expectation scores during the course of the treatment: in the Chinese patients, confidence in the sham treatment decreased, while US patients built more confidence in the sham treatment through time. In China, most patients are well aware that without the de qi sensation, acupuncture treatment does not work. Acupuncture service has a very low price and is widely available in most community health care centers and hospitals in China.4 Therefore, Chinese acupuncturists have to have proficient needle manipulation skills to quickly elicit strong and long-lasting de qi sensations; otherwise, patients may switch to other acupuncturists. This may also explain the larger effect size of TA in the study site in China.

Usually in acupuncture trials, SA consists of real acupuncture needles inserted superficially at non–acupuncture points (minimal acupuncture). In this study, SA consisted of a mixture of real and nonpenetrating placebo needles and a mixture of real and sham points. In addition, in the informed consent process, patients were told that 2 different acupuncture approaches would be used but that 1 approach might not target dry mouth symptoms. Although this aspect of the informed consent process was intended to maximize confidence in both acupuncture approaches, apparently in the Chinese setting, characterized by a long cultural tradition of traditional Chinese medicine (TCM), SA was experienced differently from TA. In this setting, Chinese clinicians are deeply familiar with TCM and acupuncture. Therefore, they may have felt more uneasy using the SA procedure, a discomfort that they may have carried over to their patients. In contrast, in Western societies, TCM and acupuncture are much less deeply rooted, which likely resulted in more uncertainty about specific acupuncture treatments. Given the nature of SA, it might be reasonable to use the same acupoints as in TA but to manipulate the needles in a countertreatment manner. For example, if the treatment protocol requires "tonifying energy" at an acupoint, the SA could "sedate energy" at the same acupoint. However, acupuncturists consider this unethical, as they believe that such treatment would worsen the condition being treated.

Findings in the study by Garcia et al3 support the idea that acupuncture exerts its effects not only, or not mainly, through needle site activity and specific neurophysiological mechanisms but also through the expectations, conditioning, and suggestibility of clinicians and patients.5 The effects of these nonspecific factors may be quite large. Together with many other 3-arm acupuncture trials in Western countries, the results of the study by Garcia et al3 have disclosed what is referred to in the literature as the efficacy paradox,6 that is, even though TA and SA were similarly effective, the overall effect of either form of acupuncture was superior to standard therapy.

In a previous randomized, single-blind, placebo-controlled, multifactorial, mixed-methods clinical trial on chronic pain, the personality of individual practitioners (not their empathic behavior) and patients' beliefs about treatment veracity independently had significant effects on outcomes.7 However, patients and acupuncturists are embedded in a larger cultural context in which acupuncture appears to support the patient's therapeutic ritual in a unique way and plays a crucial role in the therapeutic outcome. In support of this, recent research has shown that these complex, ritual-induced biochemical and cellular changes in a patient's brain are very similar to those induced by drugs.8

With these ideas in mind, the cultural background of clinical acupuncture trials should increasingly move to the center of attention. What was predicted in a small interview study among patients with back pain came true: "In China, outcomes of active acupuncture will be still better than the outcomes of sham acupuncture."9



Garcia et al.'s work, reference 3:
Effect of True and Sham Acupuncture on Radiation-Induced Xerostomia Among Patients With Head and Neck Cancer: A Randomized Clinical Trial. M. Kay Garcia et al. JAMA Netw Open. 2019;2(12):e1916910. Dec 6 2019, doi:10.1001/jamanetworkopen.2019.16910
Key Points
Question  Can acupuncture prevent radiation-induced xerostomia, an adverse effect among patients with head and neck cancer undergoing radiation therapy?
Findings  In this randomized clinical trial with 339 participants, 12 months after the end of radiation therapy, the xerostomia score of the true acupuncture group was significantly lower than that of the standard care control group.
Meaning  These findings suggest that acupuncture should be considered for the prevention of radiation-induced xerostomia, but further studies are needed to confirm their clinical relevance and generalizability. 
Abstract
Importance  Radiation-induced xerostomia (RIX) is a common, often debilitating, adverse effect of radiation therapy among patients with head and neck cancer. Quality of life can be severely affected, and current treatments have limited benefit.
Objective  To determine if acupuncture can prevent RIX in patients with head and neck cancer undergoing radiation therapy.
Design, Setting, and Participants  This 2-center, phase 3, randomized clinical trial compared a standard care control (SCC) with true acupuncture (TA) and sham acupuncture (SA) among patients with oropharyngeal or nasopharyngeal carcinoma who were undergoing radiation therapy in comprehensive cancer centers in the United States and China. Patients were enrolled between December 16, 2011, and July 7, 2015. Final follow-up was August 15, 2016. Analyses were conducted February 1 through 28, 2019.
Intervention  Either TA or SA using a validated acupuncture placebo device was performed 3 times per week during a 6- to 7-week course of radiation therapy.
Main Outcomes and Measures  The primary end point was RIX, as determined by the Xerostomia Questionnaire in which a higher score indicates worse RIX, for combined institutions 1 year after radiation therapy ended. Secondary outcomes included incidence of clinically significant xerostomia (score >30), salivary flow, quality of life, salivary constituents, and role of baseline expectancy related to acupuncture on outcomes.
Results  Of 399 patients randomized, 339 were included in the final analysis (mean [SD] age, 51.3 [11.7] years; age range, 21-79 years; 258 [77.6%] men), including 112 patients in the TA group, 115 patients in the SA group, and 112 patients in the SCC group. For the primary aim, the adjusted least square mean (SD) xerostomia score in the TA group (26.6 [17.7]) was significantly lower than in the SCC group (34.8 [18.7]) (P = .001; effect size = −0.44) and marginally lower than, but not statistically significantly different from, the SA group (31.3 [18.6]) (P = .06; effect size = −0.26). Incidence of clinically significant xerostomia 1 year after radiation therapy ended followed a similar pattern, with 38 patients in the TA group (34.6%), 54 patients in the SA group (47.8%), and 60 patients in the SCC group (55.1%) experiencing clinically significant xerostomia (P = .009). Post hoc comparisons revealed a significant difference between the TA and SCC groups at both institutions, but TA was significantly different from SA only at Fudan University Cancer Center, Shanghai, China (estimated difference [SE]: TA vs SCC, −9.9 [2.5]; P < .001; SA vs SCC, −1.7 [2.5]; P = .50; TA vs SA, −8.2 [2.5]; P = .001), and SA was significantly different from SCC only at the University of Texas MD Anderson Cancer Center, Houston, Texas (estimated difference [SE]: TA vs SCC, −8.1 [3.4]; P = .016; SA vs SCC, −10.5 [3.3]; P = .002; TA vs SA, 2.4 [3.2]; P = .45).
Conclusions and Relevance  This randomized clinical trial found that TA resulted in significantly fewer and less severe RIX symptoms 1 year after treatment vs SCC. However, further studies are needed to confirm clinical relevance and generalizability of this finding and to evaluate inconsistencies in response to sham acupuncture between patients in the United States and China.

Saturday, December 7, 2019

High-achieving boys, to avoid bullying, use strategies to maintain an image of masculinity, for example becoming bullies themselves, disrupting the lessons, or devaluing girls’ achievements

Being bullied at school: the case of high-achieving boys. Sebastian Bergold et al. Social Psychology of Education, December 7 2019. https://link.springer.com/article/10.1007/s11218-019-09539-w

Abstract: Bullying victimization has been shown to negatively impact academic achievement. However, under certain circumstances, levels of academic achievement might also be a cause of bullying victimization. Previous research has shown that at least in Western countries, high school engagement is connoted by students as un-masculine. Therefore, high school engagement and achievement in school violate boys’, but not girls’, peer-group norm. This might put high-achieving boys at higher risk of bullying victimization as compared to high-achieving girls. The present study investigated boys’ and girls’ risk of bullying victimization, depending on different achievement levels. To this end, representative data of N = 3928 German fourth grade students were analyzed. Results showed that boys among the top-performers and also boys among the worst performers had a markedly higher risk of being bullied than girls showing the same achievement, whereas there were no such risk differences between genders in the average achievement groups. The relation between academic achievement and bullying victimization, features with regard to gender, and directions for future research are discussed.

Keywords: Bullying; Peer victimization; Academic achievement; Gender differences; Gender roles; Elementary school

4 Discussion

Showing high engagement in school as a boy is against the peer-group norm,
whereas doing so as a girl is not. Violating the peer-group norm, in turn, is
sanctioned by the classmates. As school engagement is an important determinant of
academic achievement, we investigated whether boys showing exceptionally high
academic achievement would be at higher risk of bullying victimization than girls
with exceptionally high academic achievement. We drew on representative data of
fourth graders from the combined TIMSS and PIRLS 2011 assessments conducted
in Germany.

4.1 General relation between academic achievement and bullying victimization
In accordance with previous research on the relation between academic achievement
and bullying victimization (e.g., Nakamoto and Schwartz 2010), we found that there
was a negative relation between both variables in general. The higher the level of
academic achievement, the lower was self-reported bullying victimization. Bullying
victimization was lowest in the profile with students exhibiting the highest achievement
level, which is in line with previous studies showing that high-performing or
gifted students in general are somewhat less often bullied than average students
(Estell et al. 2009; Peters and Bain 2011). This pattern is also in accordance with
studies investigating the social integration of gifted students (which, on average,
show markedly higher achievement than students with average ability; e.g., Rost
and Hanses 1997; Wirthwein et al. 2019). For example, gifted students in elementary 
school age as well as in adolescence were found to be well-integrated into their
classes: They seemed to be even somewhat more popular among their classmates
and somewhat less rejected than students with average ability (e.g., Czeschlik and
Rost 1995; Rost 2009).
Whereas this is an encouraging result for high-performing and gifted students as
a whole, it is a worrying finding for low-performing students. It became apparent
that the frequency of victimization was alarming for students in the profiles with low achievement: A quarter of the Profile 1 students reported being bullied weekly, and another 45% of these students reported being bullied once or twice a month. Altogether, well over two-thirds of these students were victimized to a non-trivial extent. Of course, due to the cross-sectional nature of our data, we cannot draw any conclusions about the causal direction of this relation. Drawing on previous research, it can be regarded as certain that bullying victimization impedes academic achievement, probably through many different pathways (e.g., Buhs et al. 2006; Juvonen et al. 2000, 2011; Ladd et al. 1997, 2017; Schwartz et al. 2005). However, there might additionally be an effect in the other direction. Very poor achievement might also predispose students to being victimized by classmates. This would accord with Olweus' (1978) assumption that not only students with extremely high, but also students with extremely low achievement might be at higher risk of being bullied. This would also be in line with effects found in vocational contexts, according to which not only high, but also low performers are victimized more often than average performers (Jensen et al. 2014). The study by Ladd et al. (2017) (see Sect. 1) might also point in this direction, because most of the different profiles of victimization trajectories in this study had differed in academic achievement from the outset. If this causal direction should indeed prove true, anti-bullying programs should pay greater attention to low performance as a risk factor for bullying victimization.

4.2 Bullying victimization by academic achievement and gender
Although there was a clear negative relation between bullying victimization and academic achievement when considering the entire sample regardless of gender, taking gender into consideration provided more nuanced results. Consistent with our main hypothesis, we found that in the profile of students with extremely high achievement, boys had a markedly higher risk of being bullied than girls: Boys' risk of being bullied weekly was more than twice as large as girls', and boys' risk of being bullied once or twice a month was over 40% higher than girls'. Importantly, this was not the case in the profiles in the middle of the achievement spectrum, showing that this finding was specific to the group of extremely high (and low, see below) achievers. Although we cannot draw causal conclusions from this finding either, it is at least consistent with the hypothesis that highly engaged and therefore high-achieving boys (but not girls) violate the peer-group norm by showing high academic achievement and are therefore more prone to victimization than girls engaging in, and excelling at, school. Importantly, as the students excelling in one domain (e.g., reading) and the students excelling in the other domains (e.g., mathematics) were the same individuals, this finding did not differ across domains stereotypically denoted as "male" or "female".
Of course, one might argue that it is not achievement (or engagement) itself that increases high-achieving boys' risk of victimization. Instead, high-achieving boys could show other, specific behaviors or attitudes that increase their risk. This would be consistent with the finding that gifted boys, but not so much gifted girls, are seen by their teachers as being more maladjusted (Preckel et al. 2015), and maladjustment might easily make them victims of bullying (Eriksen et al. 2014; Reijntjes et al. 2010, 2011; Schwartz et al. 1993). However, studies have shown that those stereotypes do not match reality: Gifted students, whether boys or girls, do not show worse adjustment in any regard (e.g., Bergold et al. 2015; Francis et al. 2016; Rost 2009). Therefore, this alternative explanation appears unlikely.
Our finding has important practical implications: As a result of being bullied because of their high engagement and achievement, boys might reduce their school engagement and their academic achievement after having experienced victimization in order to get themselves out of the firing line. Renold (2001) has also documented further strategies of high-achieving boys to maintain masculinity, for example becoming bullies themselves, disrupting the lessons, or devaluing girls' achievements. All these avoidance strategies come at a price too high for both the individual student(s) and society in the long run. To avoid these undesirable consequences, several interventions could be implemented. One problem surely is that victimized students, and especially boys, often do not seek help from others, for example from their teacher or from their parents (e.g., Hunter et al. 2004). The psychological costs of help-seeking are often perceived as too high, comprising the fear of (further) disapproval by classmates (which is especially present in boys), feelings of one's own weakness, and feelings of a lack of autonomy (not being able to solve the problem on one's own) (Boulton et al. 2017). One possibility to help victimized students (especially boys) would be to encourage them to confide in their teachers or their parents. This can be helpful, yet the effect heavily relies on the adult's reaction and on the specific situation (Bauman et al. 2016). Especially for high-achieving students, telling the teacher about victimization could sometimes be problematic because some high-achieving students might already be perceived by their classmates as the "teacher's pet" (Babad 1995; Tal and Babad 1990; Trusz 2017). Telling the teacher about bullying and disclosing the perpetrator(s) might then possibly even worsen the situation. Therefore, intervention strategies could additionally start
at other points. One option would be to change the peer-group norm for boys. Interventions could aim at a masculinization of academic achievement and engagement in school. For example, the learning strategy of memorizing new material is more often used by girls than by boys (e.g., Artelt et al. 2010; Heyder and Kessels 2016). However, Heyder and Kessels (2016) showed that labeling memorizing with a stereotypically masculine designation ("training consequently" vs. "memorizing diligently") increased boys' choice of the memorizing strategy (whereas there was no effect on girls' choices). This could be a promising approach to make school engagement seem more acceptable to boys and, thereby, to destigmatize boys who show high levels of school engagement, which could in turn decrease their victimization. Likewise, high academic achievement might be made more acceptable to boys by labeling it as a result of competition, which is perceived as a stereotypically male domain (e.g., Niederle and Vesterlund 2011). However, it would be important here to define competition in an intra-individual sense rather than in an inter-individual sense, since competition between classmates would likely trigger average students' upward comparisons, making negative reactions to the high-achieving students possibly even more likely (Di Stasio et al. 2016; Festinger 1954). Rather, instruction should stimulate intra-individual comparisons, inspiring boys to compete with themselves to achieve better and better, with high(er) academic achievement as a kind of trophy finally gained.
Another interesting finding, which we had not predicted, was that not only high-, but also low-achieving boys showed a greater risk of bullying victimization than their female counterparts. Whereas the risk difference is well explainable for the high-achieving students (violation of the peer-group norm by showing high engagement and achievement), it appears harder to explain for the low-achieving students, because displaying poor achievement (and engagement) is not inconsistent with the male gender role. Maybe boys' academic achievement suffers more from victimization than girls'. Another explanation would be that low-achieving boys in particular, rather than low-achieving girls or average- or high-performing boys, react more aggressively to victimization (aggression and cognitive ability are negatively related; e.g., Duran-Bonavila et al. 2017), which might in turn evoke negative reactions from the classmates, reinforce the perpetrator(s), and thus increase victimization further (Salmivalli et al. 1996; Sokol et al. 2015). However, as we cannot test this hypothesis on the basis of our data, this could be a subject of future studies.

Data from reproductive suppression in humans support the argument that populations subjected to environments dangerous for children yield birth cohorts that exhibit great longevity

Reproductive suppression and longevity in human birth cohorts. Katherine B. Saxton  Alison Gemmill  Joan A. Casey  Holly Elser  Deborah Karasek  Ralph Catalano. American Journal of Human Biology, December 6 2019. https://doi.org/10.1002/ajhb.23353

Abstract
Objectives: Reproductive suppression refers to, among other phenomena, the termination of pregnancies in populations exposed to signals of death among young conspecifics. Extending the logic of reproductive suppression to humans has implications for health, including that populations exposed to it should exhibit relatively great longevity. No research, however, has tested this prediction.

Methods: We apply time‐series methods to vital statistics from Sweden for the years 1751 through 1800 to test if birth cohorts exposed in utero to reproductive suppression exhibited lifespan different from expected. We use the odds of death among Swedes age 1 to 9 years to gauge exposure. As the dependent variable, we use cohort life expectancy. Our methods ensure autocorrelation cannot spuriously induce associations nor reduce the efficiency of our estimates.

Results: Our findings imply that reproductive suppression increased the lifespan of 24 annual birth cohorts by at least 1.3 years over the 50‐year test period, and that 12 of those cohorts exhibited increases of at least 1.7 years above expected.

Conclusions: The best available data in which to search for evidence of reproductive suppression in humans support the argument that populations subjected to environments dangerous for children yield birth cohorts that exhibit unexpectedly great longevity.


We found that 5‐year‐olds, but not 3‐year‐olds, cheated significantly more often if they overheard the classmate praised for being smart

Young Children are More Likely to Cheat After Overhearing that a Classmate is Smart. Li Zhao  Lulu Chen  Wenjin Sun  Brian J. Compton  Kang Lee  Gail D. Heyman. Developmental Science, December 6 2019. https://doi.org/10.1111/desc.12930

Abstract: Research on moral socialization has largely focused on the role of direct communication and has almost completely ignored a potentially rich source of social influence: evaluative comments that children overhear. We examined for the first time whether overheard comments can shape children's moral behavior. Three‐ and 5‐year‐old children (N = 200) participated in a guessing game in which they were instructed not to cheat by peeking. We randomly assigned children to a condition in which they overheard an experimenter tell another adult that a classmate who was no longer present is smart, or to a control condition in which the overheard conversation consisted of non‐social information. We found that 5‐year‐olds, but not 3‐year‐olds, cheated significantly more often if they overheard the classmate praised for being smart. These findings show that the effects of ability praise can spread far beyond the intended recipient to influence the behavior of children who are mere observers, and they suggest that overheard evaluative comments can be an important force in shaping moral development.

Discussion

We investigated the effects of overheard evaluative comments on young children’s moral behavior. After asking participants to promise not to cheat in a guessing game, we assessed the extent to which they would break this promise across two conditions: an overheard praise condition in which  children overheard that a classmate who was no longer present is smart, or a control condition in which they overheard comments that involved non-social information. We found that the effects of overhearing ability praise differed by age: 5-year-olds cheated significantly more frequently in response to overheard ability praise than to overheard non-social information, but the 3-year-olds’ cheating rate was not sensitive to this manipulation. These results extend prior findings (Zhao et al., 2017) by showing that, at least for 5-year-olds, ability praise can promote cheating without it being conveyed to children directly. It is noteworthy that Zhao et al. (2017) found direct ability praise promoted cheating even among 3-year-olds, with 62% of 3-year-olds and 58% of 5-year-olds engaging in cheating in response to direct ability praise, as compared to 40% and 68%, respectively, in the overheard praise condition of the present study.

Why might these contexts have a differential effect for 3-year-olds but not 5-year-olds? We believe this difference may be due to the information processing demands of overhearing a multi-party communication. In the present research, the overheard communication involved three other individuals (the two adults who were speaking, and the classmate who was being praised), as compared to one other individual (the experimenter) in the prior work on direct praise. One might expect this cognitive complexity to affect 5-year-olds as well, but this does not appear to be the case. This may be because by age 5, children have the cognitive capacity to understand complex multi-party interactions, and because they have the relevant social experience to know that they can learn a great deal from overheard conversations about other people.

An alternative explanation is that 3-year-olds are only sensitive to information about their own abilities, and thus the developmental transition concerns gaining the ability to see the behavior of others as relevant to the self. This is plausible because the direct praise study differed from the overheard praise condition in the present study not only in the form the communication took (direct versus overheard), but also in the target of the praise (the participant versus another child). However, the preliminary results of an ongoing study we are conducting suggest that this target effect cannot account for this difference: we are finding that after overhearing that they themselves are smart, 3-year-olds are cheating at a level that is close to the 40% rate that was seen in the present study.

However, this does not rule out the possibility that processing information about others is inherently more complex than processing information about the self, and that it may add to the complexity of processing overheard information in third-party contexts. This possibility would be generally consistent with theories suggesting that children use the self as a starting point for social cognition (Meltzoff, 2007). Consequently, future studies will be needed to disentangle the effects of the type of communication, versus the target of the evaluative comments. Further research will also be needed to more fully understand the effect of overheard ability praise that was observed among 5-year-olds in the present study. As noted previously, overheard ability praise may elicit concerns with social comparison. It may also lead to the inference that the experimenter places a high value on being smart, or that being smart is highly valued more generally.

These possibilities could be explored by examining whether there are similar effects on cheating when concerns with social comparison are elicited in other ways, or when the social value of being smart is communicated in other ways. An additional finding from the present study was that among 5-year-olds, boys cheated more than girls, which is consistent with gender differences in dishonesty among adults (e.g., Alm et al., 2009; Bucciol et al., 2013; Tibbetts, 1999). However, it is somewhat surprising that no gender by condition interaction was found within either age group, given the three-way interaction observed for participants overall. This might be due to the fact that our sample size for this age group was not large enough to reveal a significant two-way interaction.

This possibility is supported by a power analysis based on the results of our three-way interaction for participants overall, which revealed that a sample size of 107 would be needed to detect a significant interaction, just 7 participants more than the current sample size of 100. (However, similar power analyses based on the condition and gender effects for 5-year-olds both yielded a required sample size of 220, more than twice the current sample size.) Given that our sample size was predetermined on the basis of existing findings of condition differences, future research with larger sample sizes will be needed to look more closely at this issue. The present research significantly extends previous work on the effects of overheard conversations.

This prior work has primarily focused on how overheard interactions might promote children’s learning about language, objects, and emotions (e.g., Akhtar et al., 2001; Akhtar, 2005; Floor & Akhtar, 2010; Phillips et al., 2012; Repacholi & Meltzoff, 2007). Our work shows that overheard conversations can have unintended consequences for children’s moral behavior. Our findings also extend previous work on gossip (e.g., Eder & Enke, 1991; Gottman & Mettetal, 1986; Hill, 2007; Ingram & Bering, 2010), given that overheard ability praise can be considered a form of gossip, which is commonly defined as “the sharing of evaluative information about an absent third party” (e.g., Dunbar, 1996; for a review, see Foster, 2004). Previous work has suggested that it is not until about 8 years of age that children begin to use gossip to help them navigate social situations such as inferring social norms (e.g., Aikins, 2015; see also, Hill, 2007). The current findings suggest that even 5-year-olds have some capacity to use gossip in a similar way, and it raises questions about other ways in which young children might use gossip to make sense of the social world.

Future research will be needed to examine the effects of overhearing other forms of praise, such as praise for being honest. Another important topic to address will be the effects of overheard criticism, although addressing this question raises challenging ethical issues. The results of this research will help us to better understand the effects of overheard evaluative comments on children’s moral socialization. Our findings have broad practical implications for parents, teachers, and other caregivers. Given that evaluative comments such as ability praise are often made in public contexts, more attention should be paid to minimize the potential negative effects on children who may be listening.

In summary, the present research is the first to demonstrate that children as young as age 5 are more likely to engage in cheating after overhearing praise of another child for being smart. Our findings suggest that the negative implications of ability praise can spread outward, beyond the intended recipient, to affect the behavior of children who are mere observers. More broadly, our findings identify overheard evaluative information, a ubiquitous aspect of children’s social environment, as an important force in shaping moral development.
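The sample-size reasoning in the discussion above can be illustrated with a standard power calculation. The following is a minimal sketch, not the analysis the authors ran (they examined a three-way interaction); it uses hypothetical cheating rates and the normal-approximation formula for comparing two proportions, with only the Python standard library:

```python
from statistics import NormalDist  # stdlib, Python 3.8+


def n_per_group(p1: float, p2: float, alpha: float = 0.05,
                power: float = 0.80) -> float:
    """Required sample size per group for a two-sided
    two-proportion z-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, ~1.96
    z_beta = NormalDist().inv_cdf(power)           # power term, ~0.84
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2


# Hypothetical cheating rates of 40% vs. 60% (illustrative only):
print(round(n_per_group(0.40, 0.60)))  # about 94 per group

# Halving the rate difference roughly quadruples the required N:
print(round(n_per_group(0.40, 0.50)))
```

The steep growth of required N as the effect shrinks is why a study powered for a main effect of condition can still be underpowered for a subgroup or interaction effect, as the authors note.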

Tinder users prefer a potential partner whom they perceive to be similar in the personality traits agreeableness & openness to experience; no evidence for preferences for assortative mating based on attractiveness

Never mind I'll find someone like me – Assortative mating preferences on Tinder. Brecht Neyt, Stijn Baert, Sarah Vandenbulcke. Personality and Individual Differences, Volume 155, March 1 2020, 109739. https://doi.org/10.1016/j.paid.2019.109739

Abstract: Previous literature has identified assortative mating as the most frequent deviation from random mating both in offline dating and on classic online dating websites. However, several recent studies have suggested that assortative mating is fading due to the advent of mobile dating apps. Therefore, in this study we examine whether preferences for assortative mating are still present on the most popular mobile dating app of the moment, Tinder. For this means, we analyze experimental and survey data on 7846 Tinder profile evaluations. We unambiguously find that Tinder users prefer a potential partner whom they perceive to be similar in the personality traits agreeableness and openness to experience. With respect to similarity in perceived age, we find either no assortment or positive assortment, depending on whether we condition on other participant characteristics. Finally, we do not find any evidence for preferences for assortative mating based on attractiveness. We examine heterogeneous preferences by the gender and age of the experiment participants.

Keywords: Assortative mating, Personality traits, Big Five, Dating apps, Tinder


4. Discussion

In this study we examined whether the assortative mating preferences often identified in offline dating and on classic online dating websites are still present on recently popular mobile dating apps (MDAs) such as Tinder. More specifically, we investigated whether Tinder users had a preference for potential partners whom they perceived to be similar in age, attractiveness, and Big Five personality traits. We examined this using experimental and survey data collected by Neyt et al. (2018).
In line with previous literature on both offline dating and dating on classic online dating websites, we found evidence for assortative mating preferences based on age when controlling for other participant characteristics. Given that Watson et al. (2004), in their literature review, point to age as one of the factors with the strongest positive assortment, it is unsurprising that age was also the factor on which assortative mating was strongest in these analyses. However, correlation analyses showed no evidence for this sorting behavior. Consequently, the results of analyses unconditional on other participant characteristics are in line with recent studies on Tinder which found that assortative mating preferences are fading on this dating platform (Neyt et al., 2019; Ortega & Hergovich, 2017). Additionally, we found that individuals prefer potential partners whom they perceive to be similar on the Big Five personality traits agreeableness and openness to experience (both in analyses controlling and not controlling for other participant characteristics). This is in line with the studies of Botwin et al. (1997) and Rammstedt and Schupp (2008), although these studies also found assortative mating based on conscientiousness. Apparently, even in a setting with no search frictions, in which people show interest in a potential partner prior to meeting them, they prefer a potential partner whom they perceive to be similar in the personality traits agreeableness and openness to experience. The finding that assortative mating based on perceived personality traits is weaker than assortative mating based on age is in line with the literature review in Watson et al. (2004) as well as with these authors' own findings.
With respect to similarity in age (when controlling for other participant characteristics) and similarity in openness to experience, we also found that these results are driven by the female and the older participants. A suggestive explanation for this finding is that these groups of participants have higher standards with respect to whom they show interest in. For the female participants this would be in line with the finding of Botwin et al. (1997) that females express more discriminating preferences for personality characteristics in their ideal mate than males do. This in turn is consistent with parental investment theory (Trivers, 1972), which argues that the sex that invests more in offspring (for humans, the females) is more discriminating in its mate preferences. For the older participants, this greater discrimination in mate preferences could arise because they are looking for a more serious relationship.
Further, we did not find any evidence for assortative mating preferences based on attractiveness. We argue that this is because attractiveness is not a horizontal attribute on which individuals mate assortatively but rather a vertical attribute for which there exists a predefined consensus on which potential partners are the most desirable, namely highly attractive individuals. This behavior is likely reinforced by the fact that showing interest in a person on Tinder carries low psychological costs in case of rejection.
Finally, the finding that women have a preference for a potential partner whom they perceive to be older, whereas men do not exhibit age preferences, is in line with the findings of Kenrick and Keefe (1992). Indeed, they report that in the early mating years (in which most of our participants are; see Table 1 for descriptive statistics on participants' age) men do not yet exhibit preferences for a younger potential partner, while women already prefer an older potential partner.
We end this study by pointing out the main limitations of our research design. First, we only examined mating behavior in the first stage of the dating process. Nonetheless, we believe findings with respect to this first stage are interesting, as it is a necessary stage that every individual trying to find a partner on an MDA needs to pass through to advance to the next stages of a relationship.
Second, although the experimental design was a very close reflection of reality and although the data could not suffer from socially desirable answering, it would still be interesting to verify whether the partner preferences identified in this study also hold in reality. We suggest that future studies, if possible, use data directly provided by Tinder.
Third, with the data used in this study we are only able to examine whether certain assortative mating preferences exist and whether they differ between certain groups of participants. However, we are unable to deduce why they exist. We suggest that future studies examine, potentially using qualitative data, why exactly, for example, female and older participants have stronger preferences for potential partners who are similar in openness to experience.
Fourth, in this study we used the TIPI to measure the Big Five personality traits. Given that this scale measures each personality trait with only two questions, other, more extensive scales are able to capture personality more rigorously. However, given that we asked participants to rate 16 profiles, a more elaborate scale was not appropriate in this study, as results would suffer too much from boredom-induced bias. Still, we encourage future studies to use a more extensive Big Five scale to examine assortative mating based on personality in dating.
Next, in this study we only examined individuals from Western countries (supra, Section 2). It would be interesting to examine whether preferences for assortative mating also differ between cultures, e.g. between Western and non-Western individuals. While Buss (1989) analyzed absolute mating preferences across 37 cultures, to the best of our knowledge no such analysis has been done for assortative mating preferences.
Finally, in this study we only examined assortative mating preferences based on age, attractiveness, and personality. Naturally, individuals could have preferences for similarity on many more characteristics, such as ethnicity, socioeconomic status, and education level, to name a few. As assortative mating based on these characteristics would have substantial societal consequences, we encourage future research to examine sorting behavior with respect to them on recently popular MDAs such as Tinder.


Sex with another person, with an orgasm, was perceived to have a relatively stronger effect on men than on women in terms of sleep quality; sexual activity without orgasm was perceived by men as sleep-impairing

A national survey on how sexual activity is perceived to be associated with sleep. Ståle Pallesen et al. Sleep and Biological Rhythms, December 3 2019. https://link.springer.com/article/10.1007/s41105-019-00246-9

Abstract: There is a paucity of studies investigating how sexual activity is perceived to influence sleep, despite conceptions about significant gender differences regarding this issue. In all, 4000 persons, aged between 18 and 55 years, were randomly drawn from the Norwegian Population Registry and invited to participate in a postal survey. The respondents were asked how sexual activity with another person, with or without orgasm, and how masturbation, with and without orgasm, influenced sleep latency and sleep quality. A total of 1080 persons participated (response rate 28.2%) of which 56.1% were women. The mean age of the sample was 38.7 years (SD = 10.8). Sexual activity with an orgasm was perceived to have a soporific effect by both men and women. Sexual activity with another person, with an orgasm, was perceived to have a relatively stronger effect on men compared to women in terms of sleep quality. Sexual activity without an orgasm was reported by men to have a sleep-impairing effect, whereas the perceived effect reported by women was equivocal. Sexual activity with orgasms was perceived as having a soporific effect in both men and women. Sexual activity without an orgasm had an equivocal perceived effect on sleep.

Keywords: Gender differences, Orgasm, Sexual activity, Sleep onset latency, Sleep quality, Soporific effect


Discussion

The mean self-reported habitual sleep onset latency in the sample was somewhat longer than normal for young adults, albeit within the normal range for middle-aged and older adults, for both men and women [15]. Generally, sexual activity with orgasm was perceived to shorten sleep latency as well as improve sleep quality in both men and women. This is in line with previous notions that orgasm has soporific effects [6, 7, 9], and thus supports our first hypothesis, which stated that orgasms following sexual activity would generally be perceived to have a soporific effect, albeit a larger one for men than for women. The exact mechanism behind the soporific effect of orgasms is not clear, but it may be attributable to the release of neurohormones such as oxytocin, prolactin, and endorphins, which are assumed to have relaxing properties [16-18]. The effect seemed to be larger for men than for women, especially concerning orgasm following sexual activity with another person. The positive perceived effect on sleep of sex with an orgasm was also reported by the only previous survey on this topic, although that survey found no gender differences [9]. The gender difference in the perceived soporific effect of masturbation with orgasm was, however, not significant, a finding in line with the aforementioned survey [9].
The difference score (effect of masturbation with an orgasm minus effect of sexual activity with another person with an orgasm) was negative for men but neutral for women in terms of sleep latency. Thus, regarding sleep latency, men perceived a greater soporific effect of sexual activity with another person, with an orgasm, compared to masturbation with an orgasm, whereas women reported no significant difference. The difference score in terms of sleep quality was negative for both men and women. Still, it was significantly larger for men, implying that men, compared to women, seem to experience a greater soporific effect of sexual activity with another person, with an orgasm, compared to masturbation with an orgasm. Hence, sex with another person, with an orgasm, had a stronger perceived soporific effect for men than for women (both for sleep latency and sleep quality) compared to masturbation with an orgasm.
This lends support to our second hypothesis (that masturbation followed by orgasm, relative to orgasm following sex with another person, would be perceived to have a relatively stronger soporific effect for women than for men). One possible explanation for this finding is that men, according to some studies, have a higher energy expenditure during intercourse than women [19], which may promote sleep [20]. However, not all studies have shown that men expend relatively more energy during sexual activity than women [21], and since sexual activity is often of relatively short duration [22], potential gender differences in energy expenditure during sexual activity are unlikely to explain gender differences in the perceived soporific effects of sexual activity on sleep. Another explanation for these gender differences is that men have a stronger and more biologically and genitally driven sexual drive, whereas women's sexual drive is to a larger extent romantically driven, with a greater emphasis on intimacy [10, 11]. This view seems congruent with models of sexual selection, which posit that males invest less in offspring, have a higher reproductive rate, and benefit more from mating with multiple partners than women do [23].
Hence, when having sex with another person, women generally may put more emphasis on the relationship, whereas men may put more emphasis on sexual gratification [10, 11, 23]. This may contribute to men falling asleep more easily than women after sexual activity with another person ending in orgasm, as men at this point may have attained their goal, whereas women may still want emotional intimacy or confirmation about the relationship. It is also known that most men have a refractory period following orgasm during which they cannot experience further erection or orgasm, whereas women's postorgasmic genital arousal is more variable [24], which may influence the soporific effects of sexual activity differently across genders.
According to our third hypothesis, sexual activity without orgasm was expected to have no influence on sleep. However, men actually reported longer sleep onset latency and poorer sleep quality both when sexual activity with another person and when masturbation did not end in orgasm. For women, this was only the case for sleep latency following sexual activity with another person without orgasm. Women reported no effect on sleep onset latency following masturbation without orgasm, and no effect on sleep quality when sexual activity with another person or masturbation did not end in orgasm. Taken together, these findings show that men seem to be negatively affected by sexual activity without an orgasm, whereas women appeared to respond less strongly and more neutrally. In this regard the present findings are not in line with those of the previous survey by Lastella and colleagues, which suggested that sexual activity, whether or not it ended in orgasm, had a perceived soporific effect. However, the questions used in that survey appear not to have addressed the absence of orgasm explicitly [9], which might explain the discrepancy in results. The third hypothesis was thus not supported for men, and only partly supported for women. Overall, the lack of an orgasm following sexual activity seems to be reported as more frustrating for men than for women, leading to perceived poorer sleep for men compared to women. This may again reflect a different emphasis on the part of men (e.g. sexual gratification) compared to women (e.g. intimacy) when it comes to sexual activity. It is also known that sexual encounters end in orgasm more often for men than for women [25]; hence the lack of an orgasm may be more frustrating and sleep-impairing for men.

Limitations and strengths

The response rate of the present study was low, despite the fact that the questionnaire was short, up to two reminders were sent, and material reinforcement (a gift card lottery) was used. However, the low response rate can probably be explained by the sensitive (sexual) topic under investigation [26]. It should be noted that low response rates do not imply that results are invalid [27]. Still, we acknowledge that the findings should be replicated in future studies. Although similar to those used in a recent survey [9], the questions about the perceived effect of sexual activity on sleep were constructed for the purpose of the present study; hence their psychometric properties are unknown. This is a limitation, and future research efforts should be taken to establish items for this topic, for example using the Delphi method [28]. Questions about sexual behaviors are sensitive by nature; hence it cannot be ruled out that some respondents did not answer truthfully. However, care was taken to inform respondents about how the data would be registered and how confidentiality was ensured. In addition, self-completion questionnaires were used, as these seem to yield more valid reports than interviews [29].
It should be noted that the questions were quite general (sexual activity with another person or masturbation); future studies on this topic should therefore differentiate better between sexual behaviors (e.g. sex with a new vs. a familiar partner) and also assess their duration, to investigate how sleep is affected by them. In some of the analyses the number of respondents was lower than the total sample, as those answering "not relevant" were left out of the analysis. The effect of sexual behavior on sleep was evaluated retrospectively, which may render the responses vulnerable to recall bias; the use of diaries in future studies on this topic is therefore encouraged [30]. It should also be noted that only two sleep outcomes were evaluated (sleep onset latency and sleep quality), as these were regarded as most sensitive to the potential soporific effects of sex. Still, future studies should include a wider array of sleep variables as outcomes [31]. The present study was based on subjective ratings of sleep only; hence the findings should be corroborated by objective sleep measures in the future. As orgasms may be described along several dimensions, and since there may be some gender differences in this regard [32], this should be taken into consideration in future studies on this topic. The present study did not differentiate between phases of the menstrual cycle for the female respondents, although the cycle may influence both sleep [33] and sexual behavior [34]. Hence, future studies should take this into account. Future research should in addition aim at identifying variables beyond gender that might explain variance in the soporific effects of sexual activity.
In terms of strengths, it should be noted that the present study is one of the first large surveys to address the soporific effect of sexual behavior on sleep, and it thus contributes novel findings on a topic that is often debated and heavily surrounded by myths. The sample was drawn from the Norwegian Population Registry, which increases the generalizability of the present findings. The sample was weighted by the discrepancy between general population and sample characteristics in terms of age and gender, thus correcting for different response rates among subgroups.

Australia: Disabled men were at least twice as likely to be attracted to females & males, not experience sexual attraction, identify as bisexual or homosexual & have female & male sexual partners

Does sexual orientation vary between disabled and non-disabled men? Findings from a population-based study of men in Australia. Anne-Marie Bollier et al. Disability & Society, Dec 3 2019. https://doi.org/10.1080/09687599.2019.1689925

Abstract: Some research suggests that disabled people are more likely to be sexual minorities than non-disabled people, but this evidence comes mainly from younger or older populations. We used data from a large survey of Australian men aged 18–55 to examine the relationship between disability and minority sexual orientations. Results from our statistical analyses suggest that a larger proportion of disabled than non-disabled men are sexual minorities. Our estimates showed that disabled men were at least twice as likely as non-disabled men to be attracted to females and males, not experience sexual attraction, identify as bisexual, identify as homosexual and have female and male sexual partners—relative to the likelihood of female-only attraction, heterosexual identity and female-only sexual partners. Findings provide new information about sexual diversity in disabled versus non-disabled Australian men, which can help inform inclusive service provision and identify avenues for future research about sexual minority disabled people.

Keywords: disability, men, sexual orientation, sexual minority, sexual identity, sexual attraction


Distribution of Facial Resemblance in Romantic Couples Suggests Both Positive and Negative Assortative Processes Influence Human Mate Choice

Holzleitner, Iris J., Kieran J. O'Shea, Vanessa Fasolt, Anthony J. Lee, Lisa M. DeBruine, and Benedict C. Jones. 2019. “Distribution of Facial Resemblance in Romantic Couples Suggests Both Positive and Negative Assortative Processes Influence Human Mate Choice.” PsyArXiv. December 5. doi:10.31234/osf.io/pw5c

Abstract: Previous research suggests that humans show positive assortative mating, i.e. tend to pair up with partners that are similar to themselves in a range of traits, including facial appearance. Facial appearance can function as a cue to genetic similarity and plays a critical role in human mate choice. Evidence for positive assortative mating for facial appearance has largely come from studies showing people can match pictures of couples’ faces at levels greater than chance and that facial photographs of couples are rated to look more similar than those of non-couples. However, interpreting results from matching studies as evidence of positive assortative mating for facial appearance is problematic, since this measure of perceived compatibility does not necessarily reflect actual physical similarity, and may be orthogonal to, or even negatively correlated with, physical similarity. Even if participants are asked to rate facial similarity directly, it remains unclear which, if any, face shape cues contribute to an increased perception of similarity in romantic couples. Here we use a shape-based assessment of facial similarity to show that the median similarity of long-term couples’ face shapes is only slightly greater than that of an age-matched control sample. Moreover, this was driven by the most similar 40% of couples, while the most dissimilar 20% of couples actually showed disassortative mating for face shape when compared to the control sample. These data show that a simple measure of central tendency obscures variability in the extent to which couples display assortative or disassortative mating for face shape. By contrast, a more fine-grained analysis that considers the distribution of variation across couples in the extent to which they resemble each other suggests that both positive and negative assortative processes influence human mate choice.

Dissimilarity data and analysis code are available at https://osf.io/m9f54

Excerpts:

The extent to which romantic couples physically resemble each other is a long-standing question with implications for influential theories of mate choice, such as optimal outbreeding theory22. Optimal outbreeding theory acknowledges that mating with closely-related individuals can have a large negative effect on reproductive fitness (i.e., results in less viable offspring), but emphasizes that excessive outbreeding (mating with highly genetically dissimilar individuals), too, can have a negative effect on reproductive fitness23,24. Consequently, while folk psychology theories predict that romantic couples will physically resemble each other, optimal outbreeding theory predicts that both assortative and disassortative processes may influence human mate choice. Several studies have demonstrated that perceptions of facial similarity are very highly correlated with (i.e. nearly indistinguishable from) perceptions of genetic relatedness, demonstrating that facial similarity can function as a cue of genetic relatedness25,26. Moreover, facial appearance is known to play a critical role in social interaction, including romantic partner choice14,27,28. Consequently, much of the research on the extent to which romantic couples physically resemble each other has investigated facial similarity between romantic partners. While several studies have reported that the faces of romantic partners can be matched at levels greater than chance15-19, such results do not necessarily indicate that romantic couples physically resemble each other. For example, matching of romantic couples at levels greater than chance could occur simply because people similar in physical attractiveness are judged more likely to be in a romantic relationship with each other than people who differ in their physical attractiveness29 (but see30). Moreover, the physical traits associated with attractiveness in men and women are not identical and, in some cases, even opposite. 
For example, feminine facial features are attractive in women, while masculine facial features are attractive in men (although the extent to which this is the case is disputed31-35). This first important limitation of previous work can be avoided entirely by using nonperceptual measures of facial resemblance. One approach for objectively defining and comparing face shape is to assess the position a face occupies in 'face space'. Face space is a multi-dimensional space representing the global face-shape dimensions derived from Principal Component Analysis of shape coordinates. Within this multi-dimensional face space, similarity can be quantified as the Euclidean distance between individual faces (see36 for a recent review).

A second important limitation of previous work on this topic is that it has used measures of central tendency to investigate the extent to which couples on average resemble each other. Focusing exclusively on measures of central tendency can, however, obscure important variation in the data37,38. This variation is likely to be particularly important in the context of research motivated by optimal outbreeding theory, since that theory explicitly predicts that both assortative and disassortative processes will influence mate choice.

In light of the above, we first used distance in face space to objectively assess the degree of similarity between romantic couples in face shape and compared these scores with controls. We then sought to establish whether there are systematic differences among couples in the extent to which they resemble each other. First, we calculated shape-dissimilarity scores for 3D scans of 178 couples' faces. Shape-dissimilarity scores were the Euclidean distance in a multidimensional face space derived from ten-fold cross-validated PCA of 3D face-shape coordinates.
In order to create a control distribution, we identified all possible pairings between each woman and all men in the set who were within five years of her actual partner’s age. The median number of control pairings per woman was 71. We then calculated the dissimilarity score for each control pairing. The median control dissimilarity score for each woman was calculated and is referred to hereon as the control dissimilarity score. Figure 1A shows the distributions of couple and control dissimilarity scores.
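The dissimilarity measure itself is just a Euclidean distance between positions in face space. A minimal stand-alone sketch, using made-up 10-dimensional coordinates in place of the study's actual PCA scores (all names and values below are illustrative placeholders, not the study's data):

```python
import math
import statistics

# Made-up face-space coordinates (in the study these would be PCA
# scores of 3D face-shape coordinates).
faces = {
    "woman":    [0.2, -0.1, 0.5, 0.0, 0.3, -0.2, 0.1, 0.4, -0.3, 0.0],
    "partner":  [0.1, -0.2, 0.4, 0.1, 0.2, -0.1, 0.0, 0.5, -0.2, 0.1],
    "control1": [1.0, 0.8, -0.5, 0.9, -1.0, 0.7, -0.6, 0.2, 0.8, -0.9],
    "control2": [-0.7, 0.6, 1.1, -0.8, 0.9, -1.2, 0.5, -0.4, 0.6, 0.7],
    "control3": [0.9, -1.1, 0.3, 1.2, -0.6, 0.8, -0.9, 1.0, -0.5, 0.4],
}

def dissimilarity(a, b):
    """Euclidean distance between two faces in face space."""
    return math.dist(faces[a], faces[b])

# Couple score: the woman vs. her actual partner.
couple_score = dissimilarity("woman", "partner")

# Control score: median distance to age-matched alternative partners.
control_score = statistics.median(
    dissimilarity("woman", c) for c in ("control1", "control2", "control3")
)

print(couple_score < control_score)  # True: this toy couple is more similar
```

In the study each woman had a median of 71 such control pairings; the three controls here only illustrate the median-of-distances construction.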

[Figure 1. (A) Dissimilarity score distributions of couples (target women + actual partner, N=178) and controls (target women + median of age-matched controls, N=178). Scores were centered on the median control dissimilarity (dashed line). Median couple dissimilarity was marginally smaller than median control dissimilarity. (B) Difference strip chart showing the difference in similarity between each woman and her actual partner vs. her median control. The horizontal lines mark the deciles, with the thicker line marking the median. (C) The shift function shows the couple minus control difference for each decile (y-axis) as a function of couple deciles (x-axis). For each decile difference, the vertical line indicates the 95% bootstrap confidence interval (1,000 samples).]

Couple and control dissimilarity scores were initially compared using a paired-samples bootstrapping technique. The median difference score between couple and control dissimilarity was significantly lower than 0 (estimate=−71, p=.040; Figure 1B), suggesting couples are slightly less dissimilar than chance. However, this analysis of central tendency ignores more fine-grained information about the full distribution.

Therefore, we next separated couples into deciles based on their dissimilarity scores. Within each decile, we then compared the couple and control dissimilarity scores and plotted this difference at each decile (Figure 1C). If distributions of couple and control dissimilarity scores were identical, one would expect to see a flat line around 0 for all deciles. If distributions were merely shifted to the left or right, the shift function would show a flat line below or above 0. Figure 1C shows that couple dissimilarity was significantly lower than control scores in the first four deciles (i.e., the most similar 40% of couples) and significantly greater than control scores in the last two deciles (i.e., the most dissimilar 20% of couples). Thus, while the most similar 40% of couples show assortative mating for face shape, the most dissimilar 20% of couples show disassortative mating for face shape.

This underlines the limitation of a simple central-tendency comparison of similarity when testing for assortative or disassortative mating. Analysis of a measure of central tendency showed the type of assortative mating predicted by folk psychology and reported in some previous research. However, the effect was weak. By contrast, analyzing resemblance between couples using deciles, which allows for a far more fine-grained analysis of the distribution of resemblance across couples, showed clear evidence of both assortative and disassortative processes in human mate choice.
This finding suggests that individuals may differ in the costs and benefits of assortative vs disassortative mating. Future research could investigate predictors of such individual differences. Not only does the pattern of results found here support an explicit prediction from optimal outbreeding theory (that both assortative and disassortative processes will influence human mate choice), it also highlights the pervasive problem of relying on analyses of measures of central tendency when studying complex behaviors.
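The decile-based shift function can be sketched as follows. The scores below are simulated stand-ins (centered, as in the paper's Figure 1) rather than the study's data, and the specific quantile routine is an assumption; couples are simply given a wider spread than controls to mimic the reported pattern:

```python
import random
import statistics

random.seed(1)

# Simulated, centered dissimilarity scores (not the study's data).
# Couples get a wider spread than controls: more similar than controls
# at the low end, less similar at the high end.
controls = [random.gauss(0.0, 1.0) for _ in range(178)]
couples = [random.gauss(0.0, 1.5) for _ in range(178)]

# statistics.quantiles(n=10) returns the 9 decile cut points.
shift = [
    c - k
    for c, k in zip(
        statistics.quantiles(couples, n=10),
        statistics.quantiles(controls, n=10),
    )
]

# Negative values in the low deciles: the most similar couples are more
# similar than controls. Positive values in the high deciles: the most
# dissimilar couples are less similar than controls.
print([round(s, 2) for s in shift])
```

The paper additionally attaches a 95% bootstrap confidence interval (1,000 resamples) to each decile difference before declaring it significant; that step is omitted here for brevity.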

We often say we are Bayesian reasoners; instead, people reason in a digital manner, assuming that uncertain information is either true or false when using that information to make further inferences

Johnson, S. G. B., Merchant, T., & Keil, F. C. (2019). Belief digitization: Do we treat uncertainty as probabilities or as bits? Journal of Experimental Psychology: General. Dec 2019. https://doi.org/10.1037/xge0000720

Abstract: Humans are often characterized as Bayesian reasoners. Here, we question the core Bayesian assumption that probabilities reflect degrees of belief. Across eight studies, we find that people instead reason in a digital manner, assuming that uncertain information is either true or false when using that information to make further inferences. Participants learned about 2 hypotheses, both consistent with some information but one more plausible than the other. Although people explicitly acknowledged that the less-plausible hypothesis had positive probability, they ignored this hypothesis when using the hypotheses to make predictions. This was true across several ways of manipulating plausibility (simplicity, evidence fit, explicit probabilities) and a diverse array of task variations. Taken together, the evidence suggests that digitization occurs in prediction because it circumvents processing bottlenecks surrounding people’s ability to simulate outcomes in hypothetical worlds. These findings have implications for philosophy of science and for the organization of the mind.

General Discussion

Do beliefs come in degrees? Here, we showed that they do not when we use those beliefs to make further predictions: in such cases, probabilities are converted from an 'analog' to a 'digital' format and are treated as either true or false. Compared to Bayesian norms, participants across our studies consistently underweighted low-probability relative to high-probability hypotheses, often ignoring low-probability events completely. This neglect challenges theories of cognition that posit a central role for graded probabilistic reasoning. Here, we discuss where this tendency appears to come from and in what ways it might be limited.

Predictions from Uncertain Beliefs
Many studies have found that when an object’s category is uncertain, people rely on the single
most-probable category when predicting its other features. Although some studies find individual
differences and variability among tasks, single-category use has held up across many different kinds
of categorization schemes (e.g., Johnson, Kim, & Keil, 2016; Lagnado & Shanks, 2003; Malt, Murphy, & Ross, 1995; Murphy & Ross, 1994, 1999).
Plausibly, these limitations on probabilistic reasoning are specific to category-based induction
tasks. The purpose of categories, after all, is to simplify the world and carve it into discrete chunks.
But another possibility is that these previous findings are due to a much broader tendency in our
reasoning about uncertain hypotheses and their implications. A categorization of an object is a
hypothesis about what kind of object it is; similarly, a causal explanation is a hypothesis about what led something to happen, and a mental-state inference is a hypothesis about what someone is thinking. The current studies find that people think in terms of only one hypothesis at a time in a causal reasoning task, suggesting that such digital thinking is a broad feature of hypothetical thinking. This is consistent with the singularity hypothesis (Evans, 2007), according to which people entertain only a single possibility at a time—an idea with broad explanatory power in higher-level cognition.
Why does digitization occur when making predictions from uncertain beliefs? Such predictions
typically require three processes. First, potential hypotheses must be evaluated, given the available
evidence, resulting in estimates of the hypothesis probabilities P(A) and P(B) (abduction). Second, the prediction needs to be made conditionally on each hypothesis holding, that is, in each relevant possible world, resulting in estimates of the predictive probabilities P(Z|A) and P(Z|B) (simulation). Finally, these conditional predictions need to be weighted by the plausibility of each hypothesis (integration), leading to an estimate of P(Z) = P(A)P(Z|A) + P(B)P(Z|B). Although people are able to perform each of these processes, each is accompanied by its own limitations and biases. How does each of these stages contribute to digitization? Our experiments are most consistent with a model in which abduction leads to more extreme explicit hypothesis probabilities, simulation capacity limits result in digitization, and integration leads people to under-use hypothesis probabilities relative to predictive probabilities. This conclusion is necessarily provisional at this early stage, but here we lay out the best case made by the evidence.
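The integration step, and the digitized shortcut the studies document, can be sketched numerically. This is a minimal illustration with invented probabilities, not the authors' materials: the Bayesian norm averages the predictive probabilities weighted by each hypothesis, while a digitized reasoner commits to the most probable hypothesis and ignores the rest.

```python
# Bayesian norm vs. "digitized" prediction, with illustrative numbers.

def bayesian_prediction(p_hyp, p_pred_given_hyp):
    """P(Z) = sum over hypotheses H of P(H) * P(Z|H) -- full integration."""
    return sum(p_h * p_z for p_h, p_z in zip(p_hyp, p_pred_given_hyp))

def digitized_prediction(p_hyp, p_pred_given_hyp):
    """Treat the single most-probable hypothesis as certain: P(Z) ~ P(Z|H*)."""
    best = max(range(len(p_hyp)), key=lambda i: p_hyp[i])
    return p_pred_given_hyp[best]

# Hypothesis A is more plausible (70%) but predicts Z weakly;
# hypothesis B is less plausible (30%) but predicts Z strongly.
p_hyp = [0.7, 0.3]    # P(A), P(B)
p_pred = [0.2, 0.9]   # P(Z|A), P(Z|B)

print(bayesian_prediction(p_hyp, p_pred))   # ~0.41 (0.7*0.2 + 0.3*0.9)
print(digitized_prediction(p_hyp, p_pred))  # 0.2 -- B's contribution ignored
```

The gap between the two outputs (0.41 vs. 0.2) is the signature the studies look for: the digitized answer under-uses the low-probability hypothesis even though its predictive probability is high.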
The abduction phase—deciding among potential hypotheses as the best explanation for the
data—relies on a variety of heuristics. Although many of these heuristics may adaptively help to
circumvent computational limits or even lead to more accurate inferences, these heuristics lead to
systematic biases relative to Bayesian norms. Most relevant, people assign a higher probability to a
hypothesis that outperforms its competitors, relative to what is implied by objective probabilities
(Douven & Schupbach, 2015; see also Lipton, 2004). This sort of process could plausibly give rise to
digitization. Moreover, explanation often leads to overgeneralization in the face of exceptions
(Williams et al., 2013), consistent with the idea that abduction tends to underweight or ignore lower-
probability hypotheses. But in our studies, abduction does not seem to be a necessary ingredient for
digitization, since digitization even occurs when hypothesis probabilities P(A) and P(B) are provided
explicitly, avoiding the need for abductive processing (Studies 5 and 8A). The most likely resolution
of this puzzle is that abduction leads us to explicitly assign more extreme probabilities to hypotheses,
relative to Bayesian norms, but not to ignore those less-likely hypotheses altogether.
The simulation phase—imagining the plausibility of the prediction in the possible worlds defined
by each hypothesis—is known to have sharp capacity limits (Hegarty, 2004). Indeed, even within a
simulation of a single causal system, people imagine each step in that system piecemeal. Thus, it seems unlikely that people can simulate multiple possible worlds and store their outputs simultaneously. Consistent with the idea that this is the key processing bottleneck that produces digitization, people do consider multiple possibilities when the predictive probabilities P(Z|A) and P(Z|B) are given explicitly, avoiding the need to simulate these outcomes (Studies 8B and 8C).
Yet, this does not seem to be the whole story. The integration phase—putting together multiple
pieces of evidence and weighing each by their diagnosticity—is also subject to biases. In particular,
people tend to over-rely on information about evidence strength (e.g., the proportion of cases consistent with a hypothesis) relative to information about evidence weight (e.g., sample size) (Griffin & Tversky, 1992; Kvam & Pleskac, 2016). Although this bias should not be extreme enough to lead people to
ignore lower-probability hypotheses, it could result in overconfidence—overly extreme probabilities—if people treat predictive probabilities as strength information (how likely the prediction is within each possible world) and hypothesis probabilities as weight information (how much to consider each
possible world). This pattern seems to be consistent with the data. Even when both the hypothesis
and predictive probabilities are given explicitly, requiring only integration to occur, participants over-rely on the high-probability relative to the low-probability hypothesis (Study 8C).
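The strength-versus-weight distinction can be made concrete in the biased-coin setting Griffin and Tversky (1992) used. The specific sample sizes below are invented for illustration: normatively, confidence that a coin favors heads depends on both the proportion of heads (strength) and the number of flips (weight), whereas intuitive confidence tends to track the proportion alone.

```python
# Normative confidence for a coin known to be biased 60/40 toward
# heads or toward tails, with equal prior odds on the two biases.

def posterior_heads_bias(heads, flips, bias=0.6):
    """P(coin favors heads | data), assuming equal priors on the biases."""
    tails = flips - heads
    like_h = bias ** heads * (1 - bias) ** tails   # P(data | heads-bias)
    like_t = (1 - bias) ** heads * bias ** tails   # P(data | tails-bias)
    return like_h / (like_h + like_t)

# High strength, low weight: 3 heads in 3 flips (100% heads).
print(posterior_heads_bias(3, 3))    # ~0.77
# Lower strength, higher weight: 10 heads in 14 flips (~71% heads).
print(posterior_heads_bias(10, 14))  # ~0.92 -- normatively more diagnostic,
# yet intuitive confidence tends to track the 100% proportion instead.
```

If people map predictive probabilities onto strength and hypothesis probabilities onto weight, the same under-use of weight would show up as under-responsiveness to hypothesis probabilities, as in Study 8C.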
Thus, all three processing steps appear to contribute to overly extreme probability judgments,
albeit in different ways. Abduction may result in explicit probabilities that are too extreme, relative to
Bayesian norms. Integration seems to result in under-responsiveness to hypothesis probabilities. And
simulation seems to lead people to ignore lower-probability hypotheses entirely.
If digitization can lead to systematic errors, relative to Bayesian norms, why might the mind use
this principle? Digitization is often necessary to avoid a combinatorial explosion (Bobrow, 2012;
Friedman & Lockwood, 2016). Suppose you are unsure whether the Fed will raise interest rates.
Depending on this decision, Congress may attempt fiscal stimulus; depending on Congress’s decision, the CEO of Citigroup may decrease capital reserves; and depending on the CEO’s decision, SEC regulators may tighten enforcement of certain rules. Integrating across such chains of possibilities becomes daunting even for a computer as the number of branches increases. As recently as the 1990s, chess-playing computers used brute-force methods to search through trees of possible moves, and even the famous Deep Blue, despite its massive processing power, could not consistently defeat the best human players: it lost its 1996 match against Garry Kasparov and won the 1997 rematch only narrowly, 3.5 games to 2.5. The computationally efficient way to approach such a problem is precisely the opposite of brute force—to construct plausible scenarios and ignore the rest. Human chess players had, and probably still have, far better heuristics for pruning this huge space of possibilities. Our participants’ error was using this strategy even when the normative calculation is straightforward. This strategy may be adaptive in other contexts. Indeed, when the most-likely hypothesis has a probability close to 100%, it may even be a reasonable approximation to the Bayesian solution.
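A toy version of the combinatorial explosion described above: with n chained binary uncertainties (Fed, Congress, CEO, regulators, ...), full integration must enumerate 2**n scenario branches, while a digitized reasoner who commits to the likely branch at each step tracks only one. The probabilities and the outcome rule here are invented for illustration.

```python
from itertools import product

def full_integration(p_steps):
    """Sum P(outcome) over every branch of the scenario tree.
    p_steps[i] = probability the i-th event goes the 'likely' way.
    Toy rule: the predicted outcome holds only if every event does."""
    total = 0.0
    for branches in product([True, False], repeat=len(p_steps)):
        prob = 1.0
        for p, went_likely in zip(p_steps, branches):
            prob *= p if went_likely else (1 - p)
        if all(branches):
            total += prob
    return total

def pruned(p_steps):
    """Digitized: treat each likely event as certain, keep one scenario."""
    return 1.0

p_steps = [0.95] * 10   # ten 95%-likely events in sequence
print(full_integration(p_steps))  # ~0.60, after visiting 2**10 branches
print(pruned(p_steps))            # 1.0 -- overconfident, but one branch
```

The pruned answer is cheap but overconfident, and the error compounds with depth; this is the sense in which digitization is a reasonable approximation only when each retained hypothesis is close to certain.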
What, then, should we make of probabilistic theories of cognition (Gershman et al., 2015;
Tenenbaum et al., 2011)? People clearly can represent analog probabilities at some level (“a 70%
chance of rain”) but our results show that they cannot use these probabilities to make downstream
predictions, instead digitizing them. Because probabilistic models typically characterize the output of
reasoning processes rather than the underlying mechanisms, they can be of great value in
characterizing the problems that our minds solve. But to the extent that such theories make
mechanistic claims involving the processing of analog probabilities within complex computations—
even at an implicit level—simpler, heuristic mechanisms may better account for human successes,
such as they are, with uncertainty. We look forward to the possibility that computational approaches
to the kinds of tasks we model in this paper can shed further light on the underlying cognitive processing.