Thursday, May 12, 2022

Low-intensity pleasure (familiar calm activities such as playing peek-a-boo) was the most influential variable in distinguishing boys from girls among the oldest infants; girls came out higher on fear, falling reactivity, and low-intensity pleasure, and boys higher on approach

Citation: Gartstein MA, Seamon DE, Mattera JA, Bosquet Enlow M, Wright RJ, Perez-Edgar K, et al. (2022) Using machine learning to understand age and gender classification based on infant temperament. PLoS ONE 17(4): e0266026.

Abstract: Age and gender differences are prominent in the temperament literature, with the former particularly salient in infancy and the latter noted as early as the first year of life. This study represents a meta-analysis utilizing Infant Behavior Questionnaire-Revised (IBQ-R) data collected across multiple laboratories (N = 4438) to overcome limitations of smaller samples in elucidating links among temperament, age, and gender in early childhood. Algorithmic modeling techniques were leveraged to discern the extent to which the 14 IBQ-R subscale scores accurately classified participating children as boys (n = 2,298) and girls (n = 2,093), and into three age groups: youngest (< 24 weeks; n = 1,102), mid-range (24 to 48 weeks; n = 2,557), and oldest (> 48 weeks; n = 779). Additionally, simultaneous classification into age and gender categories was performed, providing an opportunity to consider the extent to which gender differences in temperament are informed by infant age. Results indicated that overall age group classification was more accurate than child gender models, suggesting that age-related changes are more salient than gender differences in early childhood with respect to temperament attributes. However, gender-based classification was superior in the oldest age group, suggesting temperament differences between boys and girls are accentuated with development. Fear emerged as the subscale contributing to accurate classifications most notably overall. This study leads infancy research and meta-analytic investigations more broadly in a new direction as a methodological demonstration, and also provides most optimal comparative data for the IBQ-R based on the largest and most representative dataset to date.


We set out to leverage existing IBQ-R datasets from multiple laboratories (N = 4,438) to address an important gap in research by investigating age and gender classifications in early childhood, and overcoming limitations of the published studies such as small sample sizes that cannot be considered representative or provide widely generalizable results. Relying on algorithmic modeling techniques, 14 IBQ-R subscale scores served as features used to classify participating children as boys (n = 2,298) and girls (n = 2,093), and into three age groups: youngest (< 24 weeks; n = 1,102), mid-range (24 to 48 weeks; n = 2,557), and oldest (> 48 weeks; n = 779). Importantly, this approach allowed us to simultaneously classify infants into age and gender categories, providing an opportunity for the first time to consider the extent to which gender differences are informed by infant age. This study also makes an important contribution to the literature as a novel methodological demonstration. That is, the present application of machine learning algorithms provides a new direction for infancy and temperament research, as well as meta-analytic investigations more broadly.

Results based on accuracy indicators (the inverse of misclassification rates), Cohen’s kappa coefficients, and AUC (incorporating sensitivity and specificity parameters) demonstrated that temperament features provided superior classification of age groups relative to gender, which is consistent with the existing literature insofar as age effects have generally been more robust (e.g., not dependent on methodology; [5,26,52]). As noted, gender differences in infancy have been largely limited to activity level and fear/behavioral inhibition, with higher activity level and approach reported for boys [29,30] and greater fear/behavioral inhibition for girls [14,25,31,35,36]. These gender differences are somewhat controversial due to a lack of consensus regarding their origin (i.e., biologically based or largely a function of socialization; [53]) and questions regarding the role of parental expectations. That is, parents could rate boys and girls differently not due to actual variability in behavior but as a function of their own culturally influenced ideas about what is typical behavior in boys vs. girls. This explanation cannot be ruled out completely, although existing research suggests that gender differences are not entirely dependent on methodology (i.e., have been identified via behavioral observations along with parent report; [33,52]).
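As a rough illustration of two of the indicators named above, the following sketch computes accuracy (1 minus the misclassification rate) and Cohen's kappa for a set of predicted labels. The labels are hypothetical toy data, not values from the study, and AUC is left out for brevity.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Share of cases whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohens_kappa(y_true, y_pred):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement expected from the marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical toy labels (an assumption for illustration only).
y_true = ["girl"] * 4 + ["boy"] * 6
y_pred = ["girl", "girl", "girl", "boy", "boy", "boy", "boy", "boy", "girl", "girl"]

print(accuracy(y_true, y_pred), cohens_kappa(y_true, y_pred))
```

Kappa is lower than accuracy here because part of the raw agreement would be expected by chance given how often each label occurs.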

Importantly, results for gender classification by age group suggest that it is most effective for the oldest age group, in line with the literature that indicates gender differences in temperament attributes become more pronounced with age [54]. Although a number of factors could be contributing to this pattern of results (accentuated gender differences in temperament with increasing age and, conversely, more accurate classification of gender with temperament features for the oldest participants), socialization is often described as critical among these. The primary mechanism invoked in such explanations involves the infants’ interactional history, and is consistent with literature that indicates mothers respond differently to their sons and daughters [55–59], presenting with different affordances as social interaction partners (e.g., [60]). Over time, such differences could result in divergent trajectories with respect to temperament due to differences in socialization goals/approaches for boys vs. girls. Specifically, parents may prioritize relationship orientation for daughters, but competence and autonomy for sons [61–63]. These and other socialization-related pathways may be responsible for the stronger temperament-based classification of boys and girls later in infancy observed herein.

At the same time, gender is viewed as a marker for a host of sex-linked distinctions in physiological processes. For example, prenatal exposure to high levels of androgen is predictive of later behavior problems, primarily of the externalizing type (e.g., ADHD; [64]), and used to explain early vulnerability observed in boys with respect to this set of problems [65]. Postpartum biological effects are also possible, for example via testosterone increases for boys in infancy, referred to as “mini-puberty,” peaking by the second month and returning to baseline at about 6 months [66]. Sex-linked differentiation in brain structures and functions occurs with maturation, resulting in greater discrepancies with age. For example, Goldstein et al. [67] reported that the amygdala tends to be larger in males and the hippocampus larger in females (see Hines [68] for a related review).

Follow-up analyses outlining feature importance for classification models were performed for the Ensembled Decision Trees (Random Forest) to further interpretation of the observed results. Random Forest methods provide an effective mechanism for feature selection and importance using tree-based mechanisms to rank node classification via the mean decrease in gini impurity, i.e., the probability that a random sample in a particular tree node would be mislabeled using the distribution of the node sample, averaged across all trees [69]. Figures provided in Supplemental Materials (S1–S3 Figs) demonstrate that while Fear was the most important feature in distinguishing boys and girls for the youngest and mid-range age group, for oldest infants, low intensity pleasure was most influential. In fact, for youngest infants (S3 Fig), all three distress-related scales (Fear, Distress to Limitations, Sadness) were of primary importance in classifying infants accurately by gender via the Random Forest algorithm. Positive emotionality and regulatory dimensions of temperament (e.g., Falling Reactivity, Approach) begin to take on greater importance for mid-range and oldest infants. Notably, certain temperament features detracted from model accuracy in classifying infants by gender (i.e., associated with lowest negative importance values), particularly Cuddliness, Vocal Reactivity, and Smiling and Laughter in the youngest age group and Smiling and Laughter, Perceptual Sensitivity, and Activity in the oldest age group. These results identify the temperament attributes that did not differentiate boys and girls effectively, and it is of interest that the list of these poorly differentiating features varied by age.
When the most important features were considered for age classification and gender classification models only, Fear again emerged as the critical dimension, which is in line with the extensive literature documenting the developmental progression as well as gender differences for this domain of temperament [2,13,14,26,54].
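The Gini impurity behind the feature rankings described above can be sketched for a single tree node as follows. This is a minimal illustration on hypothetical labels and a made-up split, not the authors' pipeline; a full Random Forest importance score would, roughly, sum such decreases over every node that splits on a given feature, weight them by node size, and average across trees.

```python
from collections import Counter

def gini_impurity(labels):
    """Probability that a random sample from the node is mislabeled when
    labels are assigned according to the node's own class distribution."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def impurity_decrease(parent, left, right):
    """Size-weighted drop in Gini impurity produced by one split."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children

# Hypothetical node with 4 boys and 4 girls, split on an invented threshold.
parent = ["boy"] * 4 + ["girl"] * 4
left = ["boy", "boy", "boy", "girl"]
right = ["boy", "girl", "girl", "girl"]

print(gini_impurity(parent), impurity_decrease(parent, left, right))
```

A feature whose splits tend to produce purer child nodes accumulates a larger mean decrease and therefore ranks higher.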

This work is not without limitations, chief among these our reliance on a single method (i.e., parent report) in the assessment of infant temperament. Future studies should aggregate datasets providing different sources of information, including behavioral observations and physiological measures, such as cortisol reactivity, heart rate variability/respiratory sinus arrhythmia, and/or frontal alpha asymmetry ascertained via electroencephalogram (EEG) recordings. In addition, the outcomes examined in this study were limited to child gender and age. Future studies with older children should conduct classification analyses with additional dependent variables, particularly symptom and disorder classifications (e.g., clinical/subclinical/asymptomatic ADHD). It should be noted that we did not consider classification based on race/ethnicity because of a far more limited literature suggesting these differences can be discerned on the basis of temperament, and future research should examine related models, as relevant studies accumulate. Finally, the present modeling approach could be extended and potentially improved by applying ensembling modeling approaches (i.e., using multiple algorithms simultaneously), as opposed to relying on singular modeling frameworks.

This study underscores the importance of meta-analytic investigations and cross-laboratory collaborations, providing elusive answers to questions, such as those related to intersections of gender and age in temperament development, that have not been previously addressed. Because of the large cross-laboratory sample included herein, this study provides most optimal comparative data for the IBQ-R (Table 2), which has emerged as a widely used infant temperament assessment tool. Importantly, the present investigation serves as a methodological illustration for application of machine learning techniques in infancy and temperament research, as well as developmental science more broadly. Given the propensity for differing algorithmic methods to have strengths and weaknesses that may bias predictive outcomes and classification accuracy, we selected 11 established algorithmic modeling and classification techniques to quantify the most robust outcomes, simultaneously demonstrating the viability of machine learning approaches in this area of scientific inquiry. Results of this study make an important contribution to developmental temperament research, demonstrating effective age group classification on the basis of fine-grained temperament features, and indicating more effective gender classification for the older age group, with multiple implications for future mechanistic research examining potential socialization and biological contributors.

Brain size and IQ are positively correlated; however, multiple meta-analyses have led to considerable differences in summary effect estimations, thus failing to provide a plausible effect estimate

Of differing methods, disputed estimates and discordant interpretations: the meta-analytical multiverse of brain volume and IQ associations. Jakob Pietschnig, Daniel Gerdesmann, Michael Zeiler and Martin Voracek. Royal Society Open Science, May 11 2022.

Abstract: Brain size and IQ are positively correlated. However, multiple meta-analyses have led to considerable differences in summary effect estimations, thus failing to provide a plausible effect estimate. Here we aim at resolving this issue by providing the largest meta-analysis and systematic review so far of the brain volume and IQ association (86 studies; 454 effect sizes from k = 194 independent samples; N = 26 000+) in three cognitive ability domains (full-scale, verbal, performance IQ). By means of competing meta-analytical approaches as well as combinatorial and specification curve analyses, we show that most reasonable estimates for the brain size and IQ link yield r-values in the mid-0.20s, with the most extreme specifications yielding rs of 0.10 and 0.37. Summary effects appeared to be somewhat inflated due to selective reporting, and cross-temporally decreasing effect sizes indicated a confounding decline effect, with three quarters of the summary effect estimations according to any reasonable specification not exceeding r = 0.26, thus contrasting effect sizes were observed in some prior related, but individual, meta-analytical specifications. Brain size and IQ associations yielded r = 0.24, with the strongest effects observed for more g-loaded tests and in healthy samples that generalize across participant sex and age bands.

4. Discussion

In this quantitative research synthesis, we show that positive associations of in vivo brain volume with IQ are highly reproducible. This link is consistently observable regardless of which empirical studies are included in a formal meta-analysis and how they are analysed. Results of our analyses convergently indicate that the effect strength must be assumed to be small-to-moderate in size, with the best available estimates for healthy participants in full-scale IQ ranging from r = 0.24 (uncorrected; approximately 6% explained variance) to 0.29 (corrected; approximately 8% explained variance). Effects for full-scale IQ appear to be stronger and more systematically related to moderators compared to verbal and performance IQ. However, these three intelligence domains are highly intercorrelated and their correlation with IQ test results are to be seen as manifestations of a largely similar true effect across domains. We, therefore, focus on full-scale IQ findings of healthy samples in our discussion, unless indicated otherwise.
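The explained-variance figures quoted above follow from squaring the correlation coefficient. A minimal sketch of that arithmetic, where the function name is an illustrative choice:

```python
def explained_variance(r):
    """Share of variance in one variable linearly accounted for by the other."""
    return r ** 2

# The two full-scale IQ estimates quoted in the text.
for r in (0.24, 0.29):
    print(f"r = {r:.2f} -> {explained_variance(r):.1%} explained variance")
```

Squaring gives roughly 5.8% and 8.4%, which the text rounds to approximately 6% and 8%.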

4.1. Comparisons with previous meta-analyses

The strengths of the observed summary effects in the present meta-analysis correspond closely to those identified by Pietschnig et al. [24], although the number of participants in this updated analysis is more than three times larger. The observed association for full-scale IQ in healthy samples (i.e. corresponding to selection criteria of the meta-analyses from [25] and [23]) resulted in an estimate of r = 0.24 (95% CI [0.22; 0.27]), thus indicating considerably lower associations than those reported by Gignac & Bates [25] and McDaniel [23]. Key characteristics of the available meta-analyses are summarized in table 5.

Table 5.

Characteristics of available meta-analyses on the in vivo brain volume and intelligence link. Note. k = number of independent samples in analysis; summary effect = best estimate according to authors of meta-analysis; when both Hedges & Olkin- as well as Hunter & Schmidt-typed analyses were performed, both estimates are provided, respectively.

It could be argued that these inconsistencies are to a certain extent due to the differing methodological focus of the used analyses because both meta-analyses of Gignac & Bates [25] and McDaniel [23] reported values that were corrected for direct range restriction. However, when we respecified our analyses to apply identical methods, full-scale IQ associations for healthy samples once more led to a lower estimate, yielding r = 0.29. This indicates that the reported estimates of prior Hunter & Schmidt-based syntheses were inflated (i.e. even before accounting for dissemination bias).
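For readers unfamiliar with the correction mentioned above, the standard formula for direct range restriction (often called Thorndike Case II) can be sketched as follows. The SD ratio used here is a made-up value for illustration; the excerpt does not report the ratios that the meta-analyses actually applied.

```python
import math

def correct_direct_range_restriction(r, u):
    """Thorndike Case II correction for direct range restriction.

    r: correlation observed in the range-restricted sample
    u: SD(unrestricted) / SD(restricted) on the selection variable
    """
    return (r * u) / math.sqrt(1.0 + r * r * (u * u - 1.0))

# Hypothetical inputs: observed r = 0.24 and an assumed SD ratio of 1.25.
corrected = correct_direct_range_restriction(0.24, 1.25)
print(round(corrected, 3))
```

With u = 1 (no restriction) the formula returns r unchanged, and any u above 1 raises the estimate. This is why corrected summary effects are systematically larger than uncorrected ones.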

This idea is supported by our analyses of individual data subsets that used the very same specifications as these prior studies. For instance, Gignac & Bates [25] showed that IQ assessments with higher g-ness (i.e. reflecting abilities that are more closely related to psychometric g, thus providing a better representation of cognitive abilities) yielded larger associations than less g-loaded assessments. They concluded that the most salient estimate of the brain volume and IQ association averages r = 0.40 (i.e. corresponding to about 16% of explained variance), based on a specific subset of effect sizes that should provide the most credible results (i.e. using healthy samples, tests with excellent g-ness and attenuation-corrected effect sizes only).

None of the reasonable specifications that were included in our specification curve analysis yielded a summary effect that was larger than r = 0.37. Importantly, this most extreme upper value of all possible specifications was based on the very same inclusion criteria as the specification that is supposed to represent the best operationalization of this association according to Gignac & Bates [25] (healthy samples, excellent g-ness, range departure corrected, Hunter & Schmidt estimator), excepting sample age (this uppermost value was based on children/adolescents only; the same specification with all ages yielded r = 0.34, corresponding to 11% of explained variance). This is important for a number of reasons.

First, it shows that the specification that was chosen by Gignac & Bates [25] leads to estimates in the extreme upper tail of the distribution of reasonable summary effects. Besides yielding uncharacteristically large values, these estimates have large confidence intervals (i.e. representing higher effect volatility), because they are based on comparatively small sample numbers. Results from our combinatorial meta-analyses showed that at least 75% (i.e. the bottom three quartiles) of results yielded values below r = 0.26.

Second, these findings suggest that the estimate reported in Gignac & Bates [25] must be considered to have been inflated, even if one were to assume that this extreme specification yields the most salient estimate for the brain volume and IQ association (i.e. the summary effect in [25] exceeds the upper threshold of any estimate of the present summary effect distribution). Third, the lower summary effects in the present analyses compared to the earlier estimate of Gignac & Bates [25], when identical specifications were used, indicate that the studies that were added in the present update of the literature reported lower correlations, thus conforming to a decline effect [21,22].

Consistent with this interpretation, publication years of primary studies predicted brain volume and IQ associations negatively, indicating decreasing effect sizes over time. Cross-temporally declining effect sizes have been demonstrated to be prevalent in psychological science in general and intelligence research in particular, especially when initial study sample sizes are small [22]. This means that early and small n (=imprecise) primary study reports represent more often than not overestimates of the brain size and IQ association, thus having led to inflated meta-analytic summary effects. The presently observed effect declines and comparatively large effect estimates of early small-n studies (e.g. [5]) are consistent with the decline effect and its assumed drivers.

4.2. Moderators

It is unsurprising that effects were typically stronger in healthy than in patient samples because the included patients suffered from different conditions that are likely to impair cognitive functioning (e.g. autism, brain traumas, schizophrenia) which is bound to introduce statistical noise into the data. Therefore, effects of moderators were substantially weaker and less unequivocal for patients than for healthy samples.

Consistent with Gignac & Bates [25], there were stronger associations with highly g-loaded tests compared to fairly g-loaded ones in healthy participants (uncorrected rs = 0.31 versus 0.19; Q(2) = 23.69; p < 0.001), but not in patient samples. These results were supported by the findings from our regression analyses where larger g-ness positively predicted effect sizes of healthy participants.

Within any examined subgroup, correlations that had been reported within publications were numerically larger than those that had been obtained through personal communications or from the grey literature. This suggests that correlations were selectively reported in the published literature although only differences in full-scale IQ associations of healthy samples reached nominal significance. This observation is consistent with effect inflation because larger associations are more likely than smaller ones to be numerically reported in the literature (numerically stronger effects are more likely to become significant—depending on sample sizes and accuracy—and therefore more likely to be published), thus potentially leading to inadequate assumptions of the readers about the effect strength. This finding is supported by results from our regression analyses that showed weaker effects of unpublished than published effect sizes. This suggests that the reported effects in the brain size and intelligence literature are more often inflated than not, thus conforming to results from Pietschnig et al. [24].

In a similar vein, publication years were negatively related to effect sizes, thus indicating a confounding decline effect [21] and conforming to cross-temporally decreasing effect sizes as reported in an earlier meta-analysis [24].

The only further moderator with consistent directions in terms of the observed association appeared to be measurement type which consistently yielded larger estimates for intracranial than for total brain volume, although these differences did not reach nominal significance (except for verbal IQ in patient samples). There were no consistent patterns in regard to age or sex in subgroup or regression analyses, thus conforming to a previous account that indicated that brain volume and IQ associations generalize over participant age bands and sex ([24]; but see [23], for conflicting findings).

4.3. Dissemination bias

Three of our formal methods for detecting dissemination bias yielded significant bias indications for both full-scale and performance IQ (Sterne & Egger's regression, Trim-and-Fill analysis, Copas & Shi's method), while only one method (Trim-and-Fill analysis) indicated bias in verbal IQ. The evidence for bias was stronger for full-scale than performance IQ. It should be noted that both Sterne & Egger's regression and the Trim-and-Fill analysis are funnel plot asymmetry-based methods and consequently particularly sensitive for the detection of small-sample effects. This means that the detected bias seems to be rooted in the correspondingly large error variance of underpowered (i.e. small sample size) studies and is consistent with previously raised concerns about suboptimal power in neuroscientific research [201]. Viewed from this perspective, declining effect sizes over time appear to be somewhat reconciliatory, because this may well mean that average study power has increased in this field (or at least in studies addressing this research question).
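To make the idea of a funnel plot asymmetry test concrete, here is a minimal sketch of the classic Egger-type regression: each study's standardized effect (effect divided by its standard error) is regressed on its precision (1 divided by the standard error), and an intercept far from zero points to small-study effects. Whether the authors used exactly this variant is an assumption. The effect sizes and standard errors are invented, and a real analysis would add a significance test on the intercept.

```python
def egger_regression(effects, std_errors):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns (intercept, slope). With a symmetric funnel the intercept is
    close to zero and the slope approximates the underlying effect.
    """
    x = [1.0 / se for se in std_errors]
    y = [e / se for e, se in zip(effects, std_errors)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical studies in which imprecise (large SE) studies report larger effects.
effects = [0.45, 0.40, 0.30, 0.25, 0.22]
std_errors = [0.20, 0.15, 0.10, 0.05, 0.03]
intercept, slope = egger_regression(effects, std_errors)
print(intercept, slope)
```

In this toy data the intercept is clearly positive, the pattern expected when small, underpowered studies overestimate the association.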

The low observed replicability indices for all three domains further corroborate the evidence for effect inflation. Similarly, results of our effect estimations by means of p-value-based methods support the evidence for confounding dissemination bias, as previously observed in regard to this research question [24]. This interpretation is consistent with larger effects from published sources than from those that were obtained from the grey literature or personal communications, although these differences only reached nominal significance in meta-regressions, but not subgroup analyses.

The present findings contrast with the conclusions of Gignac & Bates [25], who did not identify bias evidence in their analysis. This discrepancy may be due to two different causes.

On the one hand, Gignac & Bates [25] included unpublished results in the publication bias detection analyses (i.e. results that [24] had obtained from the grey literature or through personal communications with authors), which (i) prevent potential bias from being detected and (ii) are conceptually unsuitable to be used in p-curve and p-uniform analyses [50,51]. On the other hand, different methods of dissemination bias detection are not equally sensitive for different forms of bias, thus necessitating a triangulation of methods for bias estimation according to current recommendations [42]. Relying on comparatively few and conceptually similar detection methods (i.e. publication bias tests of two p-value-based methods; p-curve and p-uniform; Henmi-Copas approach) may have contributed to the non-detection of bias evidence in this past meta-analysis [25], particularly because these methods are not suitable to detect small-sample effects.

Although the present findings indicate a presence of confounding publication bias, this should not be interpreted as evidence against a brain volume and IQ link. As pointed out above, these associations appear to generalize across numerous potential moderators and replicate well in terms of the identified direction. However, confounding dissemination bias suggests that the obtained summary effects in many primary studies (and even some meta-analyses) represent inflated estimates of the true association. However, it needs to be acknowledged that the future development of more reliable methods for assessing IQ on the one or in vivo brain volume on the other hand may lead to larger correlation estimates in primary studies. Nonetheless, the strength of the brain volume and IQ association must be considered to be small-to-medium-sized at best.

4.4. Significance of the observed effect

On the one hand, the strength of the observed summary effect suggests that effects of mere neuron numbers, glial cells, or brain reserve are unlikely candidates for the explanation of between-individuals intelligence differences. On the other hand, the effect is clearly non-trivial and has turned out to be remarkably reproducible in terms of its positive direction across a large number of primary studies. Consequently, brain volume should not be seen as a supervenient (i.e. one-to-one) but rather an isomorphic (i.e. many-to-one) proxy of human intelligence. This may mean that brain volume in its own right is too coarse of a measure to reliably predict intelligence differences. It seems likely that examining the role of functional aspects (e.g. white matter integrity) and more fine-grained structural elements (e.g. cortical thickness; see [2]) may help in further clarifying the neurobiological bases of human intelligence.

Wednesday, May 11, 2022

Older adults made more maladaptive episodic memory-guided social decisions, but not only because of poorer memory: Older adults were biased toward remembering people as being fair, while young adults were biased toward the opposite

Lempert, Karolina M., Michael S. Cohen, Kameron A. MacNear, Frances M. Reckers, Laura A. Zaneski, David Wolk, and Joe Kable. 2022. “Aging Is Associated with Maladaptive Episodic Memory-guided Social Decision-making.” PsyArXiv. May 11. doi:10.31234/

Abstract: Older adults are frequent targets of financial fraud. They may be especially susceptible to victimization because of age-related changes in both episodic memory and social motivation. Here we examined these factors in a context where adaptive social decision-making requires intact associative memory for previous social interactions. Older adults made more maladaptive episodic memory-guided social decisions, but not only because of poorer associative memory. Older adults were biased toward remembering people as being fair, while young adults were biased toward remembering people as being unfair. Holding memory constant, older adults engaged more with people that were familiar (regardless of the nature of the previous interaction), whereas young adults were prone to avoiding others that they remembered as being unfair. Finally, older adults were more influenced by facial appearances, choosing to interact with social partners that looked more generous, even though those perceptions were inconsistent with prior experience.

Meta-analysis: The COVID-19 pandemic was accompanied by only a small increase in loneliness

Ernst, M., et al. (2022). Loneliness before and during the COVID-19 pandemic: A systematic review with meta-analysis. American Psychologist, May 2022.

Abstract: The COVID-19 pandemic and measures aimed at its mitigation, such as physical distancing, have been discussed as risk factors for loneliness, which increases the risk of premature mortality and mental and physical health conditions. To ascertain whether loneliness has increased since the start of the pandemic, this study aimed to narratively and statistically synthesize relevant high-quality primary studies. This systematic review with meta-analysis was registered at PROSPERO (ID CRD42021246771). Searched databases were PubMed, PsycINFO, Cochrane Library/Central Register of Controlled Trials/EMBASE/CINAHL, Web of Science, the World Health Organization (WHO) COVID-19 database, supplemented by Google Scholar and citation searching (cutoff date of the systematic search December 5, 2021). Summary data from prospective research including loneliness assessments before and during the pandemic were extracted. Of 6,850 retrieved records, 34 studies (23 longitudinal, 9 pseudolongitudinal, 2 reporting both designs) on 215,026 participants were included. Risk of bias (RoB) was estimated using the risk of bias in non-randomised studies - of interventions (ROBINS-I) tool. Standardized mean differences (SMD, Hedges’ g) for continuous loneliness values and logOR for loneliness prevalence rates were calculated as pooled effect size estimators in random-effects meta-analyses. Pooling studies with longitudinal designs only (overall N = 45,734), loneliness scores (19 studies, SMD = 0.27 [95% confidence interval = 0.14–0.40], Z = 4.02, p < .001, I² = 98%) and prevalence rates (8 studies, logOR = 0.33 [0.04–0.62], Z = 2.25, p = .02, I² = 96%) increased relative to prepandemic times with small effect sizes. Results were robust with respect to studies’ overall RoB, pseudolongitudinal designs, timing of prepandemic assessments, and clinical populations. The heterogeneity of effects indicates a need to further investigate risk and protective factors as the pandemic progresses to inform targeted interventions.

Public Significance Statement: This synthesis of international research with a focus on longitudinal study designs shows small, but robust increases in loneliness during the COVID-19 pandemic across gender and age groups. As loneliness jeopardizes mental and physical health, these findings indicate that public health responses to the continuing pandemic should include monitoring of feelings of social connectedness and further research into risk and protective factors.

Keywords: COVID-19, loneliness, mental health, pandemic, social isolation


The main aim of this study was to summarize the most recent high-quality evidence for changes in loneliness in association with the COVID-19 pandemic in a systematic and rigorous way. The statistical synthesis focused on longitudinal study designs. The robustness of the results was tested and predictors of change in loneliness were also explored. Based on the pooled effect sizes of 19 studies, an overall increase in loneliness since the start of the pandemic (SMD = 0.27 [0.14–0.40] for continuous measures) was found. This constitutes a small (Cohen, 1992; Ferguson, 2009) effect, which was also heterogeneous. An exploratory metaregression was modeled to statistically explain the observed variation.
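How a pooled random-effects estimate and its heterogeneity statistic are obtained can be sketched with the DerSimonian-Laird estimator, one common choice. The excerpt does not say which estimator the review used, so treat that as an assumption; the study-level SMDs and variances below are invented.

```python
def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling.

    Returns (pooled_effect, tau_squared, i_squared_percent).
    """
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c)
    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, tau2, i2

# Hypothetical SMDs and sampling variances (not values from the review).
smds = [0.10, 0.25, 0.40, 0.55]
variances = [0.01, 0.01, 0.01, 0.01]
pooled, tau2, i2 = random_effects_pool(smds, variances)
print(pooled, tau2, i2)
```

A high I², such as the 98% reported for loneliness scores, means that most of the observed spread reflects genuine differences between studies rather than sampling error. That spread is what the exploratory metaregression then tries to explain.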
The confidence in the finding that there has been an increase in loneliness—albeit small—during the pandemic is strengthened by the results of the sensitivity analyses, the inclusion of only high-quality and longitudinal research in the meta-analyses, the relatively large number of studies with a pooled sample of 45,734 participants, and the lack of any indication of publication bias.
A previous rapid review and meta-analysis (Prati & Mancini, 2021) reported small increases in mental distress (overall g = 0.17) based on longitudinal studies. It also included three studies concerning loneliness conducted in spring 2020 (Luchetti et al., 2020; Niedzwiedz et al., 2021; Tull et al., 2020), only one of which (Niedzwiedz et al., 2021) could be included in the main analyses of this review (another one (Luchetti et al., 2020) was included in a sensitivity analysis). Their synthesis showed no statistically significant change in loneliness (g = 0.12, p = .34). The present study expands on this rapid review by including more original studies from different countries with assessments later in the pandemic.
Another recent systematic review (Buecker & Horstmann, 2021), which did not synthesize its findings meta-analytically, reported based on 12 studies (three of which were included in this review (Bu et al., 2020; Heidinger & Richter, 2020; van Tilburg et al., 2020)) that most longitudinal investigations found increases in loneliness during the pandemic, which corresponds to the present findings. Studies showing decreasing loneliness had overwhelmingly relied on prepandemic assessments conducted shortly before the implementation of physical distancing, while those with comparison data from months or years before the pandemic had observed increased loneliness during the pandemic.
The present study extends previous knowledge on changes in loneliness during the pandemic; however, the observed increase needs to be interpreted with caution: On the one hand, loneliness can be considered a normal, nonpathological reaction to changing circumstances and many people experience it at some point in their lives. On the other hand, previous research has shown that particularly sustained or chronic loneliness jeopardizes mental and physical health (Cacioppo et al., 2015; National Academies of Science, Engineering, & Medicine, 2020), and the ongoing pandemic and associated restrictions could compromise lonely individuals’ efforts to reconnect with others (Qualter et al., 2015).
Furthermore, the overall pooled effect in this study was small and the effect sizes reported by the individual studies were heterogeneous. The numerical values of effect size indices often provide limited understanding of the real-world significance of those effects, as even statistically small effects can be of high importance (e.g., Meyer et al., 2001). Interestingly, the most rigorous analysis (the sensitivity analyses that included only longitudinal study designs and studies with moderate RoB) showed a larger pooled effect size than the main analyses. This mirrors findings of the metaregression, in which studies’ higher RoB was negatively associated with the observed effect sizes. Taken together, these results suggest that the pooled effect in the present study might underestimate effects in at-risk populations.
The heterogeneity of effects might stem from the diversity of study characteristics included in prior research (e.g., age groups, healthy and clinical populations, regions, study designs, and loneliness measures). However, the fact that the metaregression accounted for less than a third of observed variance suggests that other factors may influence the different trajectories of loneliness in the pandemic context. As some original studies failed to report on previously identified vulnerable groups (e.g., individuals living alone), these could not be tested as predictors. Hence, more high-quality studies that assess risk and protective factors are needed so that their relevance can be assessed across samples. This is an important step to inform targeted prevention efforts.
The metaregression identified age, clinical populations, and studies’ overall RoB as predictors of increases in loneliness, but only overall RoB had statistically significant effects. However, the analysis might have been underpowered as it was not possible to test all predictors of interest simultaneously. While neither of two other available reviews conducted a metaregression to explore characteristics associated with changes in loneliness (Buecker & Horstmann, 2021; Prati & Mancini, 2021), Prati and Mancini (2021) explored, using metaregression, predictors of increases in mental health symptoms during the pandemic. They found no effects of mean age, gender, or study design, either. More research is needed to better understand the mechanisms underlying observed changes in loneliness. They could include response biases such as social desirability or the perceived de-stigmatization of loneliness: learning that loneliness is an experience shared by many during the pandemic might make it easier to acknowledge and disclose one’s social needs.
Another question that should be addressed is whether changes in loneliness are primarily driven by changes in perceived relationship quality or quantity, and if this differs according to individual characteristics or in subpopulations (e.g., age groups). As a consequence, efforts aimed at preventing or reducing loneliness could pursue different strategies. For example, individuals who are lonely because they are socially isolated and have few contacts might benefit from programs fostering exchange, ideally across different living contexts and between generations. Previous research has shown positive effects of interventions enhancing social support (such as buddy-care programs; Masi et al., 2011). Within the pandemic context, these types of interventions could be carried out digitally or within small “social-support-bubbles.” Others might not feel that they have too few contacts overall, but instead be dissatisfied with their close relationships. Research has shown that people in conflictual relationships feel lonelier than those who perceive their relationships as supportive (Hsieh & Hawkley, 2018; Selcuk & Ong, 2013). As the pandemic implicates a myriad of stressors affecting relationships, interventions could target the quality of partner relationships, parent–child relationships, or other configurations in which people live together, for example, through better communication (about feelings and worries, needs for support, etc.). Further approaches at the individual level might also focus on strategies to modify maladaptive social cognitions (which Masi et al. (2011) found to be the most effective). As individuals differ with respect to their ability to adapt to new situations, some might benefit from interventions aimed at changing attitudes and expectations regarding social contacts during a pandemic (e.g., regarding availability, spontaneity, and modality).
In general, prevention and intervention programs should address particularly vulnerable groups such as older individuals without internet access. Concerns have been raised about their lack of representation in large-scale, longitudinal investigations of loneliness (Dahlberg, 2021), so care must be taken to ensure that preventive measures address the needs and reach the breadth of the population instead of focusing on those who are most likely to be research participants. It should also be a research desideratum to include the most hard-to-reach members of the community.

Strengths and Limitations Including Constraints on Generality

The present study synthesized substantially more original reports than previous rapid and systematic reviews. The meta-analyses’ focus on longitudinal study designs is another strength. Besides peer-reviewed publications, this review included studies identified via other sources, for example, preprint servers (but no unpublished studies). In addition to longitudinal studies, pseudolongitudinal studies were included in the narrative synthesis and in the exploratory metaregression. However, the informative value of the metaregression was still hampered by the limited number of predictors that could be tested on the basis of the available studies (which also necessitated a stepwise procedure).
The lack of control samples unaffected by the pandemic weakens possible causal inference, making it more difficult to attribute the increase in loneliness to the pandemic. Furthermore, an alternative explanation for increases in loneliness in the population was recently provided by Buecker et al. (2021), who reported linear increases in loneliness among emerging adults over the last decades. The discussion of underlying period and/or cohort effects included more flexible social (including romantic) relationships, use of communication technology, and occupational instability. At the same time, some of these trends resulting in individuals having many, but weak social ties may have particularly come into effect in the pandemic context.
RoB assessments revealed that most original reports had a serious RoB in at least one domain, for example, regarding the measurement of loneliness (including the use of untested single items or adaptations of questionnaires originally intended to measure other constructs). Although sensitivity analyses supported the results’ robustness with respect to studies’ overall RoB, the metaregression suggested that it could have led to an underestimation of the magnitude of changes in loneliness.
Further, some variables could only be included in the analyses in ways that reduced the complexity of original study designs/dynamic situations: First, the duration between loneliness assessments was often a range and not a concrete number of days/months. The present analyses used the respective midpoint of this range. For the duration of pandemic-related restrictions, the same procedure was employed. Restriction measures were coded based on official mandates; however, this might have been imprecise if measures differed between regions and/or if the assessment spanned a period in which these rules changed. There was also little information available regarding participants’ adherence to restrictions. Thus, in summary, the study design was not suited to determine effects of (specific) restrictions on loneliness. Furthermore, as the pandemic progressed differently around the world, we used regional cutoffs to distinguish whether study assessments had taken place before or during the pandemic, but individuals might also have been affected by restrictions outside their place of residence (e.g., travel bans). However, a sensitivity analysis confirmed the results’ robustness regarding findings of studies whose “prepandemic” assessment overlapped with the introduced cutoffs.
As included studies mainly derived from the U.S. and Europe, whereas South America, Asia/Oceania, and Africa were underrepresented, the present findings might not be generalizable to populations not conforming to the WEIRD (Western, educated, industrialized, rich, and democratic) stereotype (Henrich et al., 2010). Further, the original investigations might have omitted specific groups, such as immigrants not speaking the country’s official language, people with mental and/or physical disabilities, and those without regular internet access, if conducted online.

Unhappy people are more likely to choose unhappy lives & unhealthy people more likely to choose unhealthy ones: “better the devil you know, than the devil you don't”

“Better the devil you know”: Are stated preferences over health and happiness determined by how healthy and happy people are? Matthew D. Adler et al. Social Science & Medicine, May 10 2022, 115015.


• Most people want to be both happy and healthy, but which matters most?

• We trade-off levels of happiness and physical health in the UK and the US.

• Choices are determined by respondents' own levels of happiness and health.

• Information about adaptation to physical health conditions matters too.

• The results have implications for policymakers seeking to satisfy those preferences.

Abstract: Most people want to be both happy and healthy. But which matters most when there is a trade-off between them? This paper addresses this question by asking 4000 members of the UK and US public to make various choices between being happy or being physically healthy. The results suggest that these trade-offs are determined in substantial part by the respondent's own levels of happiness and health, with unhappy people more likely to choose unhappy lives and unhealthy people more likely to choose unhealthy ones: “better the devil you know, than the devil you don't”. Age also plays an important role; older people are more likely to choose being healthy over being happy. Information about adaptation to physical health conditions matters too, but less so than respondent characteristics. These results further our understanding of public preferences with important implications for policymakers concerned with satisfying those preferences.

Keywords: Health, Subjective well-being, Happiness, Preferences

Tuesday, May 10, 2022

GDPR induced the exit of about a third of available apps in the Google Play Store from 2016 to 2019; and in the quarters following implementation, entry of new apps fell by half

GDPR and the Lost Generation of Innovative Apps. Rebecca Janßen, Reinhold Kesler, Michael E. Kummer & Joel Waldfogel. NBER Working Paper 30028. May 2022. DOI 10.3386/w30028

Using data on 4.1 million apps at the Google Play Store from 2016 to 2019, we document that GDPR induced the exit of about a third of available apps; and in the quarters following implementation, entry of new apps fell by half. We estimate a structural model of demand and entry in the app market. Comparing long-run equilibria with and without GDPR, we find that GDPR reduces consumer surplus and aggregate app usage by about a third. Whatever the privacy benefits of GDPR, they come at substantial costs in foregone innovation.

7 Conclusion
GDPR has had substantial effects on Google’s app market. In the year following its implementation, about a third of existing apps exited the market; and following GDPR’s enactment, the rate of app entry fell by more than half. Moreover, GDPR-diminished entry cohorts account for 41 percent less app usage than their pre-GDPR counterparts, indicating that the missing apps would have been valuable. Finally, apps entering after GDPR have higher average usage per app, suggesting increased development costs. We incorporate these patterns into a structural model of app demand and entry, and we find that GDPR reduces consumer surplus, app usage, and – if revenue per user did not change – developer revenue by about a quarter. 

We have two broad conclusions, one about innovation in general and the other about GDPR in particular. First, we conclude that GDPR, whatever its beneficial impacts on privacy protection, also produced the unintended consequence of slowing innovation. It is possible that privacy is valuable to consumers in ways that do not manifest themselves in usage choices. Indeed, this is the “privacy paradox” that others (Acquisti et al., 2016; Norberg et al., 2007) have documented: Citizens clamor for privacy protections in ways that belie their behavior as consumers. We are hesitant to draw policy conclusions about the advisability of GDPR from our results alone. A full evaluation of GDPR requires a tallying of the potential beneficial effects on privacy, along with its various unintended consequences such as increases in market concentration (Batikas et al., 2020; Johnson et al., 2020), undermining revenue models for content production (Lefrere et al., 2020), and – here – reducing beneficial innovation.

Second, we take our findings as additional evidence that when product quality is unpredictable, the ease of entry is an important determinant of the ex post value of the choice set to consumers. Factors reducing entry costs deliver large welfare benefits, while factors hindering entry – such as GDPR – can deliver substantial welfare losses.