Sunday, December 29, 2019

From 2013... Of the 7,000 languages spoken today, some 2,500 are generally considered endangered; this vastly underestimates the danger, less than 5% of all languages can still ascend to the digital realm

Kornai A (2013) Digital Language Death. PLoS ONE 8(10): e77056. https://doi.org/10.1371/journal.pone.0077056

Abstract: Of the approximately 7,000 languages spoken today, some 2,500 are generally considered endangered. Here we argue that this consensus figure vastly underestimates the danger of digital language death, in that less than 5% of all languages can still ascend to the digital realm. We present evidence of a massive die-off caused by the digital divide.

Conclusions
We have machine classified the world’s languages as digitally ascending (including all vital, thriving, and borderline cases) or not, and concluded, optimistically, that the former class is at best 5% of the latter. Broken down to individual languages and language groups the situation is quite complex and does not lend itself to a straightforward summary. In our subjective estimate, no more than a third of the incubator languages will make the transition to the digital age. As the example of the erstwhile Klingon wikipedia (now hosted on Wikia) shows, a group of enthusiasts can do wonders, but it cannot create a genuine community. The wikipedia language policy, https://meta.wikimedia.org/wiki/Language_proposal_policy, demanding that “at least five active users must edit that language regularly before a test project will be considered successful” can hardly be more lenient, but the actual bar is much higher. Wikipedia is a good place for digitally-minded speakers to congregate, but the natural outcome of these efforts is a heritage project, not a live community.

A community of wikipedia editors that work together to anchor to the web the culture carried by the language is a necessary but insufficient condition of true survival. By definition, digital ascent requires use in a broad variety of digital contexts. This is not to deny the value of heritage preservation, for the importance of such projects can hardly be overstated, but language survival in the digital age is essentially closed off to local language varieties whose speakers have at the time of the Industrial Revolution already ceded both prestige and core areas of functionality to the leading standard koinés, the varieties we call, without qualification, French, German, and Italian today.

A typical example is Piedmontese, still spoken by some 2–3 m people in the Torino region, and even recognized as having official status by the regional administration of Piedmont, but without any significant digital presence. More closed communities perhaps have a better chance: Faroese, with less than 50 k speakers, but with a high quality wikipedia, could be an example. There are glimmers of hope, for example [2] reported 40,000 downloads for a smartphone app to learn West Flemish dialect words and expressions, but on the whole, the chances of digital survival for those languages that participate in widespread bilingualism with a thriving alternative, in particular the chances of any minority language of the British Isles, are rather slim.

In rare cases, such as that of Kurdish, we may see the emergence of a digital koiné in a situation where today separate Northern (Kurmanji), Central (Sorani), and Southern (Kermanshahi) versions are maintained (the latter as an incubator). But there is no royal road to the digital age. While our study is synchronic only, the diachronic path to literacy and digital literacy is well understood: it takes a Caxton, or at any rate a significant publishing infrastructure, to enforce a standard, and it takes many years of formal education and a concentrated effort on the part of the community to train computational linguists who can develop the necessary tools, from transliterators (such as already powering the Chinese wikipedia) to spellcheckers and machine translation for their language. Perhaps the most remarkable example of this is Basque, which enjoys the benefits of a far-sighted EU language policy, but such success stories are hardly, if at all, relevant to economically more blighted regions with greater language diversity.

The machine translation services offered by Google are an increasingly important driver of cross-language communication. As expected, the first several releases stayed entirely in the thriving zone, and to this day all language pairs are across vital and thriving languages, with the exception of French – Haitian Creole. Were it not for the special attention DARPA, one of the main sponsors of machine translation, devoted to Haitian Creole, it is dubious we would have any MT aimed at this language. There is no reason whatsoever to suppose the Haitian government would have, or even could have, sponsored a similar effort [32]. Be it as it may, Google Translate for any language pair currently likes to have gigaword corpora in the source and target languages and about a million words of parallel text. For vital languages this is not a hard barrier to cross. We can generally put together a gigaword corpus just by crawling the web, and the standardly translated texts form a solid basis for putting together a parallel corpus [33]. But for borderline languages this is a real problem, because online material is so thinly spread over the web that we need techniques specifically designed to find it [16], and even these techniques yield only a drop in the bucket: instead of the gigaword monolingual corpora that we would need, the average language has only a few thousand words in the Crúbadán crawl. To make matters worse, the results of this crawl are not available to the public for fear of copyright infringement, yet in the digital age what cannot be downloaded does not exist.

The digital situation is far worse than the consensus figure of 2,500 to 3,000 endangered languages would suggest. Even the most pessimistic survey [34] assumed that as many as 600 languages, 10% of the population, were safe, but reports from the field increasingly contradict this. For British Columbia, [35] writes:

Here in BC, for example, the prospect of the survival of the native languages is nil for all of the languages other than Slave and Cree, which are somewhat more viable because they are still being learned by children in a few remote communities outside of BC. The native-language-as-second-language programs are so bad that I have NEVER encountered a child who has acquired any sort of functional command (and I don’t mean fluency - I mean even simple conversational ability or the ability to read and understand a fairly simple paragraph or non-ritual bit of conversation) through such a program. I have said this publicly on several occasions, at meetings of native language teachers and so forth, and have never been contradicted. Even if these programs were greatly improved, we know, from e.g. the results of French instruction, to which oodles of resources are devoted, that we could not expect to produce speakers sufficiently fluent to marry each other, make babies, and bring them up speaking the languages. It is perfectly clear that the only hope of revitalizing these languages is true immersion, but there are only two such programs in the province and there is little prospect of any more. The upshot is that the only reasonable policy is: (a) to document the languages thoroughly, both for scientific purposes and in the hope that perhaps, at some future time, conditions will have changed and if the communities are still interested, they can perhaps be revived then; (b) to focus school programs on the written language as vehicle of culture, like Latin, Hebrew, Sanskrit, etc. and on language appreciation. Nonetheless, there is no systematic program of documentation and instructional efforts are aimed almost entirely at conversation.

Cree, with a population of 117,400 (2006), actually has a wikipedia at http://cr.wikipedia.org but the real ratio is only 0.02, suggestive of a hobbyist project rather than a true community, an impression further supported by the fact that the Cree wikipedia has gathered less than 60 articles in the past six years. Slave (3,500 speakers in 2006) is not even in the incubator stage. This is to be compared to the over 30 languages listed by the Summer Institute of Linguistics for BC. In reality, there are currently less than 250 digitally ascending languages worldwide, and about half of the borderline cases are like Moroccan Arabic (ary), low prestige spoken dialects of major languages whose signs of vitality really originate with the high prestige acrolect. This suggests that in the long run no more than a third of the borderline cases will become vital. One group of languages that is particularly hard hit are the 120+ signed languages currently in use. Aside from American Sign Language, which is slowly but steadily acquiring digital dictionary data and search algorithms [36], it is perhaps the emerging International Sign [37] that has the best chances of survival.

There could be another 20 spoken languages still in the wikipedia incubator stage or even before that stage that may make it, but every one of these will be an uphill struggle. Of the 7,000 languages still alive, perhaps 2,500 will survive, in the classical sense, for another century. With only 250 digital survivors, all others must inevitably drift towards digital heritage status (Nynorsk) or digital extinction (Mandinka). This makes language preservation projects such as http://www.endangeredlanguages.com even more important. To quote from [6]:

Each language reflects a unique world-view and culture complex, mirroring the manner in which a speech community has resolved its problems in dealing with the world, and has formulated its thinking, its system of philosophy and understanding of the world around it. In this, each language is the means of expression of the intangible cultural heritage of people, and it remains a reflection of this culture for some time even after the culture which underlies it decays and crumbles, often under the impact of an intrusive, powerful, usually metropolitan, different culture. However, with the death and disappearance of such a language, an irreplaceable unit in our knowledge and understanding of human thought and world-view is lost forever.

Unfortunately, at a practical level heritage projects (including wikipedia incubators) are haphazard, with no systematic programs of documentation. Resources are often squandered, both in the EU and outside, on feel-good revitalization efforts that make no sense in light of the preexisting functional loss and economic incentives that work against language diversity [38].

Evidently, what we are witnessing is not just a massive die-off of the world’s languages, it is the final act of the Neolithic Revolution, with the urban agriculturalists moving on to a different, digital plane of existence, leaving the hunter-gatherers and nomad pastoralists behind. As an example, consider Komi, with two wikipedias corresponding to the two main varieties (Permyak, 94,000 speakers and Zyrian, 293,000 speakers), both with alarmingly low real ratios. Given that both varieties have several dialects, some already extinct and some clearly still, the best hope is for a koiné to emerge around the dialect of the main city, Syktyvkar. Once the orthography is standardized, the university (where the main language of education is Russian) can in principle turn out computational linguists ready to create a spellchecker, an essential first step toward digital literacy [39]. But the results will benefit the koiné speakers, and the low prestige rural Zyrian dialects are likely to be left behind.

What must be kept in mind is that the scenario described for Komi is optimistic. There are several hundred thousand speakers, still amounting to about a quarter of the local population. There is a university. There are strong economic incentives (oil, timber) to develop the region further. But for the 95% of the world’s languages where one or more of these drivers are missing, there is very little hope of crossing the digital divide.

Errors in choice tasks are not only detected fast and reliably, participants often report that they knew that an error occurred already before a response was produced

Are errors detected before they occur? Early error sensations revealed by metacognitive judgments on the timing of error awareness. Francesco Di Gregorio, Martin E. Maier, Marco Steinhauser. Consciousness and Cognition. Volume 77, January 2020, 102857. https://doi.org/10.1016/j.concog.2019.102857

Highlights
• Humans frequently report that they detected errors already before executing the error response.
• Early error sensations occur consistently across tasks and metacognitive measures.
• Early error sensations are not caused by an expectation bias.

Abstract: Errors in choice tasks are not only detected fast and reliably, participants often report that they knew that an error occurred already before a response was produced. These early error sensations stand in contrast with evidence suggesting that the earliest neural correlates of error awareness emerge around 300 ms after erroneous responses. The present study aimed to investigate whether anecdotal evidence for early error sensations can be corroborated in a controlled study in which participants provide metacognitive judgments on the subjective timing of error awareness. In Experiment 1, participants had to report whether they became aware of their errors before or after the response. In Experiment 2, we measured confidence in these metacognitive judgments. Our data show that participants report early error sensations with high confidence in the majority of error trials across paradigms and experiments. These results provide first evidence for early error sensations, informing theories of error awareness.

Keywords: Error awareness, Error detection, Metacognition


4. General discussion

Participants in experiments on error detection frequently report that they already knew that an error has occurred before the response was executed, a phenomenon we term early error sensation. The goal of the present study was to investigate whether these anecdotally reported early error sensations exist and whether they can be reliably reported. In four experiments using two experimental approaches, we provided evidence that early error sensations indeed exist, and that they occur on the majority of error trials. When participants were asked to classify responses in a flanker task either as being correct, as early detected errors, or as late detected errors in Experiment 1a, they reported early errors in 73.7% of errors. When an additional category for detected errors with unclear timing was introduced in Experiment 1b, early errors were reported in 59.1% of trials. When participants had to wager on the feeling of early error detection, they placed high bets on 62.4% (Exp. 2a) and 70.9% (Exp. 2b). These data demonstrate that early error sensations are reported very consistently across different primary tasks (flanker task vs. number/letter discrimination) and secondary tasks (error classification vs. post-decision wagering).
Crucial, however, is the question whether these introspective reports indeed reflect that errors were detected before the response, or whether participants were unable to discriminate between early and late errors and simply guessed that early errors must occasionally occur. A challenging problem for measuring early error sensations is that we cannot objectively determine whether a given error was detected early or late. To deal with this problem, we introduced a reference for the metacognitive reports of early error sensations. In Experiment 2, we used a Visual Awareness task in which participants had to wager on the accuracy of their responses. In the subsequent Error Awareness task, we instructed participants to place high bets on early error sensations only if they were similarly confident as for the high bets in the Visual Awareness task. We argued that this induces a common metric for judging confidence of the two tasks, which allowed us to interpret the metacognitive reports of early error detection with respect to the metacognitive judgments of visual awareness. This reasoning receives support from previous findings showing that humans represent confidence in a task-unspecific format which allows them to compare confidence across tasks with a similarly high precision as confidence within tasks (de Gardelle & Mamassian, 2014). Moreover, it has recently been suggested that integrating information from different sources into a common metric might even be the major purpose of metacognition (Shea & Frith, 2019). In Experiment 2a, the frequencies of high bets were coincidentally similar in both tasks. We can thus infer that the average confidence by which participants reported early error sensations in this experiment corresponded to the average confidence by which they were aware of the visual stimuli in the Visual Awareness task. 
This confidence level ought to be rather high given that the objective performance in the Visual Awareness task was far above chance level.
We found no evidence that metacognitive reports of early error sensations were subject to an expectation bias. If participants simply guessed that early error sensations must occasionally occur, these guesses should be influenced by expectations about the frequency of early error sensations. To investigate whether such an expectation bias exists, we manipulated the difficulty of the Visual Awareness task, and thus the frequency of high bets in this task. However, whereas the frequency of high bets in the Visual Awareness task varied between Experiments 2a and 2b, the frequency of high bets in the Error Awareness task remained constant across the two experiments. This suggests that metacognitive judgments about early error sensations are not influenced by a specific expectation bias induced by the frequency of high bets in the Visual Awareness task. While we cannot fully exclude a general bias towards instruction-driven expectations about early error sensations, our results strongly suggest that metacognitive judgments on early error sensations are very consistent and reliable across experimental procedures.
We found no evidence that early and late detected errors differ with respect to any objective features. It has been reported that uncertainty or conflict during response selection can influence post-response decision process and metacognitive judgments about errors (Steinhauser et al., 2008; Yeung and Summerfield, 2012). As a consequence, variables like stimulus congruency or RT could potentially influence subjective judgments about early error sensations. However, we found no robust evidence that this was the case in the present study. Participants reported early error sensations in a similar proportion for congruent and incongruent errors in Experiment 1. Moreover, RTs were similar across all error types. A small RT difference between early and late detected errors in Experiment 1a disappeared when we controlled for errors with unclear timing in Experiment 1b. This suggests that the emergence of early error sensations is not related to specific features of task processing like stimulus congruency or RTs. Thus, our data provide little evidence that early error sensations reflect the objective latency of error detection, which has been found to correlate with RT when response speed was directly manipulated (Steinhauser et al., 2008).
An important question is why early error sensations occurred on the majority of trials whereas the neural correlates of error awareness emerge not until 300 ms after an error (e.g., Steinhauser & Yeung, 2010). There are at least two possible explanations. A first explanation is that conclusions about the timing of error awareness from EEG measures like the Pe are incorrect. The Pe is often considered the earliest neural correlate of error awareness and the role of the Pe for the emergence of error awareness has been described within an evidence accumulation account (Steinhauser and Yeung, 2010; Ullsperger et al., 2010). It is assumed that the Pe reflects the accumulated evidence that an error has occurred, and that error awareness emerges when this evidence exceeds a threshold. The evidence is provided by cognitive, autonomous, motor and sensory processing (Bode and Stahl, 2014; Wessel et al., 2012; Wessel et al., 2011), but does not necessarily rely on early error processing represented by the Ne/ERN (Di Gregorio et al., 2018). One possibility is that the feeling of error awareness emerges already before the Pe, for instance, at the time point of the Ne/ERN or even earlier (Bode & Stahl, 2014). The Pe could represent a later stage of metacognitive processing, perhaps related to the emergence of confidence about response accuracy (Boldt & Yeung, 2015).
A second explanation is that early error sensations are a metacognitive illusion. Error awareness could emerge at the time of the Pe but the illusion is created that the error has been detected already before the response. This mechanism could serve to subjectively synchronize error awareness with the timing of the objective error in the same way as visual awareness is subjectively aligned with the onset of a visual stimulus. In the context of visual awareness, expectations and other top-down variables can influence the accumulation of sensory evidence and consequentially metacognitive judgments about stimulus awareness (de Lange et al., 2010; Kouider et al., 2010). Moreover, a backward referral process has been assumed to synchronize the subjective time point of visual awareness with the objective stimulus to create a coherent perception in the stream of consciousness (Libet et al., 1979; Libet et al., 1983). A similar process could align the subjective time point of error awareness with the emergence of the objective error. This temporal alignment of actions (i.e., a response) and their effects (i.e., the feeling of being incorrect) could further serve to evoke a sense of agency, i.e., the feeling of having caused an effect. Indeed, previous studies have shown that action-effect contingencies are influenced by their temporal contiguity and vice versa. Humans tend to perceive two events more causally related the closer they occur in time (Greville & Buehner, 2010), and causality judgments correlate with the perceived temporal contiguity between actions and their sensory effects (Haering & Kiesel, 2016). In other words, these metacognitive illusions on early error sensations could serve to reconstruct temporal contiguity between perception, action and metacognitive contents (Kouider et al., 2010).
While we obtained clear and robust results across several experiments, the present method has also some limitations. A first limitation is that using a categorical measure for the timing of error detection implies a loss of information as time is a continuous phenomenon. However, differentiating only between errors detected before and after the response has the advantage of imposing considerably lower cognitive load than using a continuous measure. For instance, in the classical Libet studies (Libet et al., 1983), participants had to indicate the time of voluntary action initiation on a visual clock. However, in addition to considerable methodological weaknesses (Trevena & Miller, 2002), monitoring a clock represents a difficult secondary task that presumably interferes with both, the primary task and the task to detect errors. In contrast, our categorical measure uses the response as a reference rather than a continuous timer. As error detection already involves response monitoring (Steinhauser et al., 2008), only minimal additional load should be imposed.
As already discussed, a second limitation is that we have no objective measure that verifies the existence of early error sensations. Future studies could solve this problem by measuring neural correlates of early error sensations. Strong evidence for the existence of early error sensations would be provided if not only the Pe but also the earlier Ne/ERN would correlate with early error sensations. If only the Pe differed between early and late detected errors, this would suggest that early error sensations emerge during the later stage of conscious error processing. However, if such a difference was found also for the Ne/ERN, this would point to early error signals such as response conflict (Yeung et al., 2004) or prediction errors (Holroyd & Coles, 2002) as the origin of early error sensations. It is even possible that brain activity preceding the response can affect metacognitive judgments on early error sensations. ERP differences between errors and correct responses have been found prior to the response (Bode & Stahl, 2014) or even on the previous trial of simple tasks (Hajcak et al., 2005; Hoonakker et al., 2016; Ridderinkhof et al., 2003), as well as in tasks involving complex sequences of motor programs such as piano playing (Maidhof, Rieger, Prinz, & Kloesh, 2009). In a similar vein, a study using self-report measures has revealed that internal error prediction occurs before responses in skilled typing (Rieger & Bart, 2016). Here, the question arises whether this activity serves as a cue for metacognitive judgments, or whether metacognition relies on direct access to the timing of these neural events.
A further question is whether early error sensations are related to early incorrect response activation. On correct trials, early incorrect response activation leads to a phenomenon called partial errors (Burle et al., 2002; Coles et al., 1995; Endrass et al., 2008), which can be consciously reported by participants (Rochet, Spieser, Casini, Hasbroucq, & Burle, 2014). Future studies could investigate whether such early incorrect response activation on error trials is responsible for early error sensations. Indeed, lower response force for errors than correct responses has been shown in skilled typing (Rabbitt, 1978). As this phenomenon has been interpreted as resulting from inhibition of the error response before actual response execution, it could be taken as indirect evidence for early error sensations. Future studies could examine whether errors accompanied by early error sensations are executed with lower response force than late errors.
The present study provides first evidence that participants have the subjective feeling of detecting errors already before they occurred. We show that these early error sensations can be robustly measured across different tasks and metacognitive judgments. Our results add to the broad body of evidence that humans have metacognitive access to a multitude of performance parameters. Previous studies could show that participants are able to report whether an error has occurred or not (Rabbitt, 1968; Rabbitt, 2002), to provide graded confidence judgments on the accuracy of their response (Boldt & Yeung, 2015), to classify the type of error they committed (i.e., to which distractor stimulus they responded; Di Gregorio et al., 2016), and to estimate their RTs in choice tasks (Bryce & Bratzke, 2014). These metacognitive contents are used for optimizing decision processes (Desender et al., 2018; Desender et al., 2014). Metacognitive representations on the timing of error detection could form another piece of information to support this optimization.

Saturday, December 28, 2019

What Arouses Evangelicals? Cultural Schemas, Interpretive Prisms, and Evangelicals’ Divergent Collective Responses to Pornography and Masturbation

What Arouses Evangelicals? Cultural Schemas, Interpretive Prisms, and Evangelicals’ Divergent Collective Responses to Pornography and Masturbation. Samuel L Perry. Journal of the American Academy of Religion, Volume 87, Issue 3, Sept 2019, Pages 693–724, https://doi.org/10.1093/jaarel/lfz024

Abstract: This study elucidates the puzzle of evangelical grievance selection by comparing evangelicals’ divergent collective responses to pornography use and solo-masturbation. Drawing on eighty in-depth interviews and content analyses of fifty-five evangelical monographs, I show how internal and external influences shape evangelicals’ evaluations of and responses to the two issues. Internally, evangelical cultural schemas of biblicism and pietistic idealism necessitate that grievances be connected directly to the Bible and believers’ “hearts.” Pornography is more aptly linked to explicit biblical proscriptions against heart-lust and consequently perceived collectively as a moral threat, compared with masturbation, which is neither directly addressed in the Bible nor unambiguously connected to lust. Externally, the growing influence of psychology within evangelicalism heightened concern about pornography’s harms while debunking myths associating masturbation with mental illness. These cultural influences provide “interpretive prisms” through which evangelicals differentially perceive the two issues, resulting in fervent anti-pornography activism and relative ambivalence toward masturbation.

He is a Stud, She is a Slut! Meta-Analysis on the Continued Existence of Sexual Double Standards for decades - but the effect is small

He is a Stud, She is a Slut! A Meta-Analysis on the Continued Existence of Sexual Double Standards. Joyce J. Endendijk, Anneloes L. van Baar, Maja Deković. Personality and Social Psychology Review, December 27, 2019. https://doi.org/10.1177/1088868319891310

Abstract: (Hetero)sexual double standards (SDS) entail that different sexual behaviors are appropriate for men and women. This meta-analysis (k = 99; N = 123,343) tested predictions of evolutionary and biosocial theories regarding the existence of SDS in social cognitions. Databases were searched for studies examining attitudes or stereotypes regarding the sexual behaviors of men versus women. Studies assessing differences in evaluations, or expectations, of men’s and women’s sexual behavior yielded evidence for traditional SDS (d = 0.25). For men, frequent sexual activity was more expected, and evaluated more positively, than for women. Studies using Likert-type-scale questionnaires did not yield evidence of SDS (combined M = −0.09). Effects were moderated by level of gender equality in the country in which the study was conducted, SDS-operationalization (attitudes vs. stereotypes), questionnaire type, and sexual behavior type. Results are consistent with a hybrid model incorporating both evolutionary and sociocultural factors contributing to SDS.

Keywords: sexual double standards, meta-analysis, gender, sexuality, social cognitions


In line with evolutionary theory (Buss & Schmitt, 1993; Trivers, 1972) and biosocial theory (Wood & Eagly, 2002, 2012), this meta-analysis demonstrated clear evidence for traditional SDS in studies assessing differences in people’s evaluation, or expectation, of men’s and women’s sexual behavior, although the effect was small. People expected behaviors associated with high sexual activity more from men than from women, and behaviors associated with low sexual activity more from women than from men. Similarly, people evaluated highly sexually active men more positively (or less negatively) than highly sexually active women, and low sexually active women more positively (or less negatively) than low sexually active men. In contrast, the overall set of studies using Likert-type-scale questionnaires for assessing SDS did not yield evidence of SDS.
We found some significant moderator effects in one or both sets of studies. First, existence of traditional SDS was behavior specific. Second, stereotypes about SDS were more traditional than attitudes about SDS. Third, studies using the “sexual double standard scale” (SDSS; Muehlenhard & Quackenbush, 1998) reported more traditional SDS than studies using the “double standard scale” (DSS; Caron et al., 1993) which demonstrated reversed SDS. Fourth, higher levels of gender equality in a country were associated with less traditional SDS. Participant gender and age, publication year, and study design were not significant moderators.

Behavioral Specificity of SDS

Regarding sexual behavior types, we found strongest evidence of SDS for being a victim of sexual coercion, followed by casual sex, and having an early sexual debut. SDS were less evident for sexual infidelity, level of sexual activity, other/mixed sexual behavior types, premarital sex, and being a perpetrator of sexual coercion. The findings for coercion and sexual encounters within a power or age hierarchy were partly in line with the predictions from biosocial theory that SDS would be most prevalent in sexual encounters where there is a power/status difference between men and women (Wood & Eagly, 2002, 2012). However, we only found double standards for victims of sexual coercion, and not for perpetrators. That we did not find differences in the evaluation of male and female perpetrators might be because both male and female perpetrators violate gender role expectations, with men violating chivalry norms, and women violating communal characteristics. Thus, male and female perpetrators might have been evaluated equally negatively for their gender-role inconsistent behavior. Moreover, the double standards for victims of sexual coercion, found in person perception studies, indicate that sexual behavior within the context of a power hierarchy is evaluated more negatively (or less positively) for female victims (e.g., more condemned, more perceived damage to reputation) than for male victims (e.g., “positive” experience that will be evaluated by peers as cool; Zaikman & Marks, 2017). Thus, girls might be blamed for being a victim of sexual coercion (Weis, 2009), whereas boys’ experiences of sexual coercion might be trivialized (Weis, 2010). This is inconsistent with the idea that male victims of sexual coercion or rape might be perceived as powerless and not willing to have sex, which violates men’s (hetero)sexual agentic gender role (Weis, 2010).
Because only a few studies examined the evaluation of both perpetrator and victim, replication of the existence and direction of SDS for coercion victims is necessary in future studies.
The finding that engaging in casual sex and having an early sexual debut were more expected and rewarded in men than in women fits partly with predictions from evolutionary theory. In terms of reproductive fitness, men would benefit more than women from having casual sex and from having sex at an early age (Buss & Schmitt, 1993; Petersen & Hyde, 2010). However, similar beneficial effects would have been expected for sexual infidelity and high sexual activity with numerous partners, but for those sexual behaviors less traditional SDS were applied. The same was true for other sexual behaviors, such as premarital sex when engaged or when in love. Our findings are in line with previous narrative reviews concluding that premarital sex in particular has become accepted for both men and women (Bordini & Sperb, 2013; Crawford & Popp, 2003).

Cross-Cultural Differences in SDS

In line with predictions from biosocial theory (Wood & Eagly, 2002, 2012), and not with evolutionary theory’s perspective of obligate sex differences, SDS were less traditional in countries with higher levels of gender equality. According to biosocial theory, in cultures with bigger differences in the gender roles of men and women, men have more power than women, which translates into traditional SDS (Wood & Eagly, 2002, 2012). However, level of gender equality was only a significant moderator in the meta-analysis conducted on studies using Likert-type-scale questionnaires, and not in the meta-analysis on differential evaluation and expectation of the sexual behavior of men and women. This might be because there was less variation in level of gender equality in the latter meta-analysis, as most studies in that meta-analysis were conducted in the United States. The effect in the latter meta-analysis, albeit nonsignificant, was in the same direction as the effect from the meta-analysis on Likert-type scales.

Changes in SDS Over Time

In line with evolutionary theory, and not with biosocial theory, time period in which the study was conducted was no longer significant when controlling for other moderators, which indicated that traditional SDS have existed for decades and are still present. This finding could indicate that stable gender differences in reproductive strategies are underlying SDS. Also, it appears that even though gender roles have become less strict in most modern Western societies (Eagly & Wood, 1999), this did not lead to less differentiation in the norms for the sexual behavior of men and women (Wood & Eagly, 2002, 2012). Possibly, it takes more time for egalitarian gender roles to permeate into the bedroom than into other domains of life such as the workplace, because sexuality is very much a private issue. Furthermore, the content of SDS may have changed over time, because most older studies focused on double standards in premarital sex in different relationship types, whereas newer studies more often focused on double standards in casual sex. Thus, changes in gender roles over time might only be reflected in changes in the behavior specificity of SDS.

Gender Differences in SDS

Regarding gender, we did not find differences between men and women in their cognitions about SDS. In light of male control theory and female control theory (Baumeister & Twenge, 2002), these findings could indicate that both male control and female control contribute equally to the existence of SDS. This means that SDS might provide evolutionary and sociocultural advantages for both genders that they would like to control. Advantages for men that arise from SDS could be improved certainty about paternity (Buss, 1994), patriarchal power over women, prevention of sexual chaos, and reduced male insecurity (Hyde & DeLamater, 1997). The advantages of SDS for women are the high value of sexual favors that they can trade for lower valued favors from men, such as economic provision, monogamous relationships, and parental investment.

Age Differences in SDS

Regarding participants’ age, we did not find support for the predictions of the gender-intensification hypothesis (Hill & Lynch, 1983). It appears that adolescence is not necessarily a period that is characterized by increased gender role pressure and intensification of people’s social cognitions about gender. However, it should be mentioned that most studies were conducted with highly educated college samples, mostly including emerging adults. It is possible that the relatively small number of studies conducted with adolescents and adults decreased the power to detect effects of age on SDS.

Implications for Evolutionary Theory and Biosocial Theory

In sum, some of the above findings are in line with evolutionary theory (Buss & Schmitt, 1993; Trivers, 1972) whereas others are in line with biosocial theory (Wood & Eagly, 2002, 2012). This converges with the findings of a recent theory-based narrative review, which demonstrated some support for predictions of both evolutionary theory and biosocial theory about the behavioral specificity of SDS, and for predictions of biosocial theory about cultural differences in SDS (Zaikman & Marks, 2017). Each theory suggests a different mechanism that underlies SDS, but these mechanisms might be intertwined. We therefore propose that a hybrid model explaining SDS from the interplay between biological predispositions and sociocultural pressures is most appropriate (Lippa, 2009). According to biosocial theory, different norms for the behavior of men and women may have arisen from societies’ division in gender roles that expects men to be assertive, dominant, and powerful, and women to be submissive, caring, and kind (Wood & Eagly, 2002, 2012). However, the division in gender roles may have a biological or evolutionary origin (Wood & Eagly, 2012), because there are gender differences in adaptive reproductive strategies leading people to view (sexual) behaviors in men and women differently. Also, the predictive power of evolutionary and sociocultural gender role pressures to explain SDS appears to depend on the sexual behavior or context under consideration. Gender roles may have more predictive power in a sexual context characterized by power/status differences. Yet, evolutionary processes might play a larger role in sexual behaviors that increase successful reproduction.

Conceptualization and Measurement of SDS

We also looked at the effects of moderators related to conceptualization and measurement of SDS. Regarding SDS conceptualization, effect sizes were significant for both stereotypes and personal attitudes. This suggests that both stereotyped beliefs about the sexual behavior of men and women and people’s personal attitudes in response to sexual behavior that violates expectancies are underlying SDS. Yet, traditional SDS were more prevalent in collective or personal expectations about the sexual behavior of men and women (i.e., stereotypes) than in people’s personal evaluation of the sexual behavior of men and women (i.e., attitudes). This finding is in line with the idea that people can have knowledge of collectively shared stereotypes with regard to SDS or personal stereotypical expectations about the sexual behavior of men and women, although they do not apply these stereotypes personally when evaluating other people’s sexual behavior (Milhausen & Herold, 2001; Signorella et al., 1993). It has been argued that knowledge of collective stereotypes is strong, stable, and does not depend on one’s experience with other people, but on culturally shared and generalized social beliefs (López-Sáez & Lisbona, 2009). Indeed, research in children as well as adults showed that content of collective gender stereotypes has not changed over time, whereas gender attitudes did become more egalitarian (e.g., Ruble, 1983; Signorella et al., 1993).
However, our findings with regard to social cognition type need to be interpreted with caution, because the vast majority of studies examined personal SDS attitudes or a mix of stereotypes and attitudes. In studies examining a combination of stereotypes and attitudes, evidence for a reversed double standard was found, a finding that is difficult to disentangle because of the muddled operationalizations of SDS in these studies. Furthermore, in the small number of studies examining stereotypes, it was not possible to distinguish between descriptive and prescriptive aspects, or between personal stereotypes and knowledge of collective stereotypes. Yet, these distinctions are important for future research. For example, knowledge of collectively shared stereotypes is less predictive of one’s own behavior toward men and women than personal stereotypes (Stangor & Schaller, 1996). Furthermore, prescriptive stereotypes (e.g., perceptions of how men and women should behave sexually) might be particularly relevant in the context of SDS as they have been associated with negative evaluations and backlash for people who behave in stereotype-inconsistent ways (Burgess & Borgida, 1999). Indeed, gender stereotypes in general are highly prescriptive in nature (Prentice & Carranza, 2002) and more predictive of people’s personal evaluation of men and women (i.e., attitudes) than descriptive stereotypes (Gill, 2004).
As expected from dual-process models of social cognition (Gawronski & Creighton, 2013), studies using explicit Likert-type-scale questionnaires did not yield evidence for traditional SDS. Yet, studies using more implicit within- or between-subjects designs did yield evidence for SDS. The Likert-type-scale questionnaires often include items such as “It’s worse for a woman to sleep around than it is for a man” in which male and female sexual behavior is explicitly contrasted to each other (Muehlenhard & Quackenbush, 1998). Therefore, in studies using such questionnaires it might have been clearer to participants that personal cognitions about SDS were assessed, leading to socially desirable responding (Greenwald et al., 2009). In between- and within-subjects designs, the focus on SDS is more implicit than in explicit self-report questionnaires. This is because in a between-subject design researchers assess cognitions about women’s and men’s sexual behavior with separate items or vignettes that they randomly assign to participants, who are generally unaware of the presence of other vignettes presented to other participants. In a within-subject design, researchers administer separate vignettes or items about women’s and men’s sexual behavior to participants in a counterbalanced way (Jonason & Marks, 2009; Reid et al., 2011; Weaver et al., 2013). Thus, this finding suggests that traditional SDS might only be present at a more implicit level. Previous research indeed showed that implicit assessments are less prone to socially desirable responding (Gawronski & Bodenhausen, 2006) and more likely to suggest existence of traditional gendered cognitions (Endendijk et al., 2013).
However, SDS were not different between studies using between- or within-subjects designs, or between studies using extensive vignettes/scenarios versus studies using questionnaires with different items about the sexual behavior of men and women. This indicates that social desirability and demand characteristics might not necessarily play a larger role in within-subject research on SDS than in between-subject research (Marks & Fraley, 2005; Milhausen & Herold, 2001). Also, this finding suggests that study designs that have only a slightly less explicit focus on SDS (i.e., not contrasting male and female sexual behavior in the same items) can yield evidence for the existence of traditional SDS. This argument is consistent with one study that specifically examined differences in implicit (i.e., under divided attention) and explicit (i.e., under full attention) SDS-cognitions, showing that traditional SDS were only present at an implicit level (Marks, 2008). However, between-subjects designs, like the study by Marks (2008), have been criticized for measuring single standards (because there is no comparison with how an individual would rate another target) instead of double standards (i.e., contrasting evaluation of male vs. female target; Crawford & Popp, 2003). Therefore, using IATs might be a fruitful direction to take to examine SDS at an implicit within-subjects level (see, for example, Sakaluk & Milhausen, 2012).
Our findings regarding questionnaire type indicated that questionnaires differ in the extent to which they yield evidence for SDS, which might also explain the nonequivalent findings in studies using these methods. Studies using the DSS (Caron et al., 1993) reported reversed double standards, whereas studies using the SDSS (Muehlenhard & Quackenbush, 1998) reported more traditional double standards, which might be explained by differences in content and scoring of the questionnaires. In the DSS all but one of the items are formulated in the direction of a traditional double standard (e.g., “It is up to the man to initiate sex.”) and participants answer the items on a scale ranging from strongly agree to strongly disagree. Such a questionnaire design cannot distinguish between people with reversed and egalitarian sexual standards, because both groups of people will (strongly) disagree with the traditional items. Therefore, we cannot be completely sure whether the negative combined mean found in studies using the DSS actually reflects reversed double standards or an egalitarian view about the sexual behavior of men and women instead. In contrast, the SDSS consists of 20 items occurring in pairs, with parallel items about men’s and women’s sexual behavior (e.g., “A [girl/boy] who has sex on the first date is easy”). In addition, six items contrast men’s and women’s sexual behavior, with some items formulated in the direction of traditional SDS (e.g., “A man should be more sexually experienced than his wife.”) and others formulated in an egalitarian way (e.g., “A woman’s having casual sex is just as acceptable to me as a man’s having casual sex.”). Participants answer all items on a scale ranging from disagree strongly to agree strongly. Difference scores are computed between the 10 male and 10 female items, and the six individual item scores are added to these difference scores.
The design of the SDSS makes it possible to assess a more complete range of reversed to traditional double standards than with the DSS. However, the SDSS score range is asymmetrical (−30 to 48). Thus, the more traditional double standards appearing in studies using the SDSS might have been an artifact of the possible range of scores.
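The asymmetry of the SDSS range can be reproduced with a small scoring sketch. It is an illustration only, not the published scoring key: it assumes each item is coded 0 to 3 (a coding chosen here because it yields the reported range of −30 to 48), that the paired items are oriented so that a higher rating means a harsher judgment, and that the six contrast items have already been recoded so that higher values are more traditional. The function names `sdss_total` and `sdss_range` are hypothetical.

```python
from typing import Sequence, Tuple

# Assumed item coding (0 = disagree strongly ... 3 = agree strongly).
# This is an assumption that happens to reproduce the reported -30..48 range.
ITEM_MIN, ITEM_MAX = 0, 3
N_PAIRS = 10      # 20 parallel items = 10 female/male pairs
N_CONTRAST = 6    # items that contrast men and women directly


def sdss_total(female_items: Sequence[int],
               male_items: Sequence[int],
               contrast_items: Sequence[int]) -> int:
    """Sum of pair differences (female minus male) plus the contrast items.

    Positive totals indicate a traditional double standard (harsher on women),
    negative totals a reversed one, under the orientation assumed above.
    """
    if len(female_items) != N_PAIRS or len(male_items) != N_PAIRS:
        raise ValueError("expected 10 female and 10 male paired items")
    if len(contrast_items) != N_CONTRAST:
        raise ValueError("expected 6 contrast items")
    pair_differences = sum(f - m for f, m in zip(female_items, male_items))
    return pair_differences + sum(contrast_items)


def sdss_range() -> Tuple[int, int]:
    """Lowest and highest totals that the scoring rule can produce."""
    lowest = sdss_total([ITEM_MIN] * N_PAIRS, [ITEM_MAX] * N_PAIRS,
                        [ITEM_MIN] * N_CONTRAST)
    highest = sdss_total([ITEM_MAX] * N_PAIRS, [ITEM_MIN] * N_PAIRS,
                         [ITEM_MAX] * N_CONTRAST)
    return lowest, highest


if __name__ == "__main__":
    print(sdss_range())
```

Under these assumptions the pair differences are symmetric around zero, but the six contrast items can only add to the total, never subtract from it. That one-sided contribution is what shifts the upper bound further from zero than the lower bound.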

Limitations and Future Directions

Some limitations of this meta-analytic study need to be addressed. First, the available body of quantitative research on SDS is highly homogeneous in terms of participant age, ethnicity, and educational level. According to biosocial theory, these factors are important in the social construction of gender roles, and more specifically for the social construction of SDS (Wood & Eagly, 2002, 2012). Therefore, future studies should examine SDS in more diverse samples in terms of ethnicity, age, and educational level.
Second, almost all studies included in this meta-analysis measured SDS in a relatively explicit way, by using self-report questionnaires, even though implicit measures, such as IATs or priming tasks, are less prone to socially desirable responding than explicit measures of stereotypes, and are often better predictors of behavior (Gawronski & Bodenhausen, 2006). Thus, researchers should make use of more implicit tasks to assess SDS. Relatedly, previous research has used many different conceptualizations of SDS, sometimes combining attitudinal aspects with stereotypical aspects within one questionnaire. We advise future researchers to be more theory-driven in their conceptualization, operationalization, and predictions regarding SDS. For example, dual-process models (Gawronski & Creighton, 2013) or social cognition frameworks (e.g., Greenwald et al., 2002) could be used to further conceptualize different aspects of people’s SDS-cognitions, that is, implicit, explicit, attitudes, stereotypes, knowledge of stereotypes, prescriptive versus descriptive aspects, and personal versus collective aspects. New measures need to be developed and validated before we can examine the interplay between different double standard components.
Furthermore, studies assessing SDS via questionnaires sometimes used questionnaires that did not distinguish between people with reversed and egalitarian sexual standards. With such questionnaires, it is impossible to study predictors of individual differences in SDS-cognitions. When researchers would like to use a questionnaire in future studies on SDS, they should use questionnaires with symmetrical scales to assess the complete range of SDS from reversed to traditional (e.g., 20 item-pairs of the SDSS; Muehlenhard & Quackenbush, 1998) or develop new questionnaires that can assess the complete range.
Last, most studies included in this meta-analysis focused on SDS in behaviors associated with high sexual activity and only a few studies have been conducted specifically on behaviors associated with low sexual activity. However, further study of differences in the strength of traditional SDS between behaviors associated with high sexual activity (more male-typical) and behaviors associated with low sexual activity (more female-typical) is important. Such research can test whether boundaries for male-typical (sexual) behavior are more strict than for female-typical behavior (Hort et al., 1990). Also, research on how people acquire traditional SDS-cognitions now is essential for designing future interventions that foster egalitarian sexual standards and sexual equality for men and women.