Thursday, November 3, 2022

Empirical Macroeconomics and DSGE Modeling in Statistical Perspective: Dismal forecasting errors + swapping data only slightly impairs the model (and in 37% of cases permuting the data makes the model better)

Empirical Macroeconomics and DSGE Modeling in Statistical Perspective. Daniel J. McDonald, Cosma Rohilla Shalizi. Oct 31 2022.

Abstract: Dynamic stochastic general equilibrium (DSGE) models have been an ubiquitous, and controversial, part of macroeconomics for decades. In this paper, we approach DSGEs purely as statistical models. We do this by applying two common model validation checks to the canonical Smets and Wouters 2007 DSGE: (1) we simulate the model and see how well it can be estimated from its own simulation output, and (2) we see how well it can seem to fit nonsense data. We find that (1) even with centuries' worth of data, the model remains poorly estimated, and (2) when we swap series at random, so that (e.g.) what the model gets as the inflation rate is really hours worked, what it gets as hours worked is really investment, etc., the fit is often only slightly impaired, and in a large percentage of cases actually improves (even out of sample). Taken together, these findings cast serious doubt on the meaningfulness of parameter estimates for this DSGE, and on whether this specification represents anything structural about the economy. Constructively, our approaches can be used for model validation by anyone working with macroeconomic time series.
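Check (1) in the abstract can be illustrated on a much smaller model. The sketch below is not the Smets and Wouters DSGE: it assumes a toy AR(1) process as a stand-in, simulates it at known parameters, re-estimates the persistence parameter by least squares, and reports how the estimation error changes as the simulated sample grows. All function names and parameter values are hypothetical choices for illustration.

```python
import numpy as np


def simulate_ar1(phi, sigma, n, rng):
    """Simulate n observations from x[t] = phi * x[t-1] + eps[t]."""
    eps = rng.normal(0.0, sigma, size=n)
    x = np.empty(n)
    # Start from the stationary distribution so there is no burn-in effect.
    x[0] = eps[0] / np.sqrt(1.0 - phi**2)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x


def estimate_ar1(x):
    """Least-squares estimate of phi from a single series."""
    return float(x[:-1] @ x[1:] / (x[:-1] @ x[:-1]))


def recovery_error(phi, sigma, n, reps, seed=0):
    """Root-mean-square error of the re-estimated phi over reps simulations."""
    rng = np.random.default_rng(seed)
    estimates = np.array(
        [estimate_ar1(simulate_ar1(phi, sigma, n, rng)) for _ in range(reps)]
    )
    return float(np.sqrt(np.mean((estimates - phi) ** 2)))


if __name__ == "__main__":
    # Assumed sample sizes, e.g. quarters of simulated data.
    for n in (100, 400, 1600):
        print(n, recovery_error(phi=0.9, sigma=1.0, n=n, reps=200))
```

For a well-behaved model like this toy one, the error should fall roughly in proportion to one over the square root of the sample size. The paper's complaint is that the DSGE's estimates improve far more slowly, and that some appear stuck away from the generating values.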

h/t Alex Tabarrok, A Big and Embarrassing Challenge to DSGE Models, Nov 3 2022


"If we take our estimated model and simulate several centuries of data from it, all in the stationary regime, and then re-estimate the model from the simulation, the results are disturbing. Forecasting error remains dismal and shrinks very slowly with the size of the data. Much the same is true of parameter estimates, with the important exception that many of the parameter estimates seem to be stuck around values which differ from the ones used to generate the data. These ill-behaved parameters include not just shock variances and autocorrelations, but also the “deep” ones whose presence is supposed to distinguish a micro-founded DSGE from mere time-series analysis or reduced-form regressions. All this happens in simulations where the model specification is correct, where the parameters are constant, and where the estimation can make use of centuries of stationary data, far more than will ever be available for the actual macroeconomy."

Now that is bad enough, but I suppose one might argue that this is telling us something important about the world. Maybe the model is fine; it's just a sad fact that we can't uncover the true parameters even when we know the true model. Maybe, but it gets worse. Much worse.

McDonald and Shalizi then swap variables, feeding the model wages as if they were output, consumption as if it were wages, and so forth. Now this should surely distort the model completely and produce nonsense. Right?

"If we randomly re-label the macroeconomic time series and feed them into the DSGE, the results are no more comforting. Much of the time we get a model which predicts the (permuted) data better than the model predicts the unpermuted data. Even if one disdains forecasting as end in itself, it is hard to see how this is at all compatible with a model capturing something — anything — essential about the structure of the economy. Perhaps even more disturbing, many of the parameters of the model are essentially unchanged under permutation, including “deep” parameters supposedly representing tastes, technologies and institutions."

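The relabelling check can be sketched in the same spirit. The code below again uses a stand-in, not the DSGE: a hypothetical three-variable VAR(1) whose coefficient matrix is restricted by a fixed mask, so that each series plays a distinct role in the model. It fits the restricted model to every relabelling of the columns and compares one-step out-of-sample forecast error. The mask, the generating matrix, and all function names are assumptions made for illustration.

```python
import itertools

import numpy as np

# Hypothetical restriction: MASK[i, j] is True when series i may depend on
# the lag of series j. This is what gives each series a distinct "role".
MASK = np.array([[True, False, False],
                 [True, True, False],
                 [True, True, True]])

# Hypothetical generating coefficients that respect the mask (stationary).
A_TRUE = np.array([[0.5, 0.0, 0.0],
                   [0.6, 0.3, 0.0],
                   [0.5, 0.5, 0.3]])


def simulate(A, n, rng):
    """Simulate y[t] = A @ y[t-1] + standard normal noise, starting at zero."""
    y = np.zeros((n, A.shape[0]))
    for t in range(1, n):
        y[t] = A @ y[t - 1] + rng.normal(size=A.shape[0])
    return y


def fit_restricted(y, mask):
    """Equation-by-equation least squares using only the lags the mask allows."""
    A_hat = np.zeros(mask.shape)
    X, Y = y[:-1], y[1:]
    for i in range(mask.shape[0]):
        cols = np.flatnonzero(mask[i])
        coef, *_ = np.linalg.lstsq(X[:, cols], Y[:, i], rcond=None)
        A_hat[i, cols] = coef
    return A_hat


def out_of_sample_mse(y, mask, n_train):
    """Fit on the first n_train rows, score one-step forecasts on the rest."""
    A_hat = fit_restricted(y[:n_train], mask)
    test = y[n_train:]
    pred = test[:-1] @ A_hat.T
    return float(np.mean((test[1:] - pred) ** 2))


def permutation_check(y, mask, n_train):
    """Out-of-sample error for every relabelling of the columns of y."""
    k = y.shape[1]
    return {p: out_of_sample_mse(y[:, list(p)], mask, n_train)
            for p in itertools.permutations(range(k))}


if __name__ == "__main__":
    data = simulate(A_TRUE, 4000, np.random.default_rng(1))
    for perm, mse in sorted(permutation_check(data, MASK, 3000).items()):
        print(perm, round(mse, 3))
```

When the restrictions really do describe the data, as they do here by construction, one would expect the original labelling to forecast best and the relabelled versions to do worse. The finding quoted above is that the DSGE often fails this kind of sanity check on actual macroeconomic series.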
Neither age nor intelligence is systematically related to wisdom; wisdom is correlated with openness, hedonic well-being, and eudaimonic well-being

Thirty Years of Psychological Wisdom Research: What We Know About the Correlates of an Ancient Concept. Mengxi Dong, Nic M. Weststrate, and Marc A. Fournier. Perspectives on Psychological Science, November 2, 2022.

Abstract: Psychologists have studied the ancient concept of wisdom for 3 decades. Nevertheless, apparent discrepancies in theories and empirical findings have left the nomological network of the construct unclear. Using multilevel meta-analyses, we summarized wisdom’s correlations with age, intelligence, the Big Five personality traits, narcissism, self-esteem, social desirability, and well-being. We furthermore examined whether these correlations were moderated by the general approach to conceptualizing and measuring wisdom (i.e., phenomenological wisdom as indexed by self-report vs. performative wisdom as indexed by performance ratings), by specific wisdom measures, and by variable-specific factors (e.g., age range, type of intelligence measures, and well-being type). Although phenomenological and performative approaches to conceptualizing and measuring wisdom had some unique correlates, both were correlated with openness, hedonic well-being, and eudaimonic well-being, especially the growth aspect of eudaimonic well-being. Differences between phenomenological and performative wisdom are discussed in terms of the differences between typical and maximal performance, self-ratings and observer ratings, and global and state wisdom. This article will help move the scientific study of wisdom forward by elucidating reliable wisdom correlates and by offering concrete suggestions for future empirical research based on the meta-analytic findings.
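To make the idea of a meta-analytic correlation concrete, here is a minimal sketch of pooling correlations across studies. It is a deliberate simplification: the authors used multilevel meta-analyses, whereas this assumes a single-level random-effects model with Fisher's z transformation and the DerSimonian-Laird estimator of between-study variance. The function name and the example inputs are hypothetical and are not data from the paper.

```python
import math


def pool_correlations(studies):
    """Random-effects pooled correlation from (r, n) pairs via Fisher's z.

    Needs at least two studies. Returns (pooled_r, (ci_low, ci_high), tau2).
    """
    z = [math.atanh(r) for r, _ in studies]
    v = [1.0 / (n - 3) for _, n in studies]  # sampling variance of Fisher's z
    w = [1.0 / vi for vi in v]
    z_fixed = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, z))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(studies) - 1)) / c)

    # Re-weight with tau^2 added to each study's variance, then pool.
    w_re = [1.0 / (vi + tau2) for vi in v]
    z_re = sum(wi * zi for wi, zi in zip(w_re, z)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    low, high = z_re - 1.96 * se, z_re + 1.96 * se

    # Transform back from z to the correlation scale.
    return math.tanh(z_re), (math.tanh(low), math.tanh(high)), tau2


if __name__ == "__main__":
    # Made-up (correlation, sample size) pairs, for illustration only.
    print(pool_correlations([(0.10, 120), (0.25, 300), (0.05, 80)]))
```

A multilevel model extends this by allowing several effect sizes from the same sample to be correlated, which this single-level sketch ignores.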


By meta-analyzing the extant literature, we summarized wisdom’s correlations with age, intelligence, the Big Five traits, narcissism, self-esteem, social desirability, and well-being. Although phenomenological and performative approaches to conceptualizing wisdom have their distinct correlates, both are correlated with openness, hedonic well-being, and eudaimonic well-being, especially the growth aspect of eudaimonic well-being. Transcending differences in conceptualizations and operationalizations of wisdom, these commonalities may reflect the fundamental characteristics of wisdom that are shared across theoretical perspectives. Specifically, wisdom entails being flexible in one’s thinking, the tendency and willingness to take on different ideas and perspectives, and an orientation toward exploration, psychological growth, and personal fulfillment. Furthermore, the results suggest that wisdom may indeed predict a good life, both in the hedonic and eudaimonic sense. Although not all forms of wisdom predict lives that are affectively positive, wiser individuals are ultimately happy, perhaps suggesting that wisdom may enable one to find contentment in life regardless of objective circumstances and one’s affective reactions to them. Importantly, the commonalities that we have identified through meta-analyses empirically corroborate earlier work (Glück, 2018; Grossmann et al., 2020) in showing that the diverse theoretical traditions and measurement approaches are not to be taken as an indication that the construct of wisdom lacks validity; instead, they should be seen as attempts that, although each incomplete and imperfect on their own, capture different aspects of the same phenomenon. We believe that these findings will in turn help future efforts at designing wisdom measures by providing more reliable estimates of wisdom correlates that will help with the evaluation of convergent and discriminant validity.
Beyond the common correlates, however, the meta-analytic results paint two distinct portraits for phenomenological and performative wisdom. The portrait for phenomenological wisdom is one of adaptation and adjustment. Individuals who experience wise cognition, motivation, emotion, and behavior are uniquely more likely to report higher self-esteem, more positive affect, less negative affect, and greater life satisfaction and have an adaptive profile of personality traits, in which agreeableness, extraversion, and conscientiousness are high and neuroticism is low. As suggested by results of the supplementary analyses, this positive association between phenomenological wisdom and adjustment cannot be fully explained by methodological artifacts such as socially desirable responding. Instead, echoing previous theorizing (e.g., Ardelt, 2019), we argue that these correlations are at least in part substantive and reflect the nature of wisdom as subjectively experienced by individuals.
However, when wisdom is judged by other people through wisdom-relevant products, as is the case with performative wisdom, it is not associated with most of the indicators of adaptation. Intelligence, a cognitive ability, is relevant to at least some (i.e., the Berlin wisdom paradigm), although not all, indicators of performative wisdom. Notably, the association between intelligence and wisdom is the strongest for crystallized intelligence. Taken together with the findings that performative wisdom correlated with openness and the growth aspect of eudaimonic well-being, it appears that in the eyes of the beholder, wisdom entails not only one’s orientation toward thinking wisely but also one’s competence at doing so. We argue that, rather than being contradictory, the findings for phenomenological and performative wisdom are complementary to one another. Perhaps analogous to the distinction in creativity research between “little-c” creativity, or the everyday, subjectively defined form of creativity, and “big-C” creativity, or the consensually recognized form of creativity (e.g., Simonton, 2017), phenomenological wisdom may capture the everyday experiences of wisdom, but whether these subjective experiences are recognized as wise by other people is a different question, which is in turn captured by performative wisdom.
Surprisingly, neither phenomenological nor performative wisdom correlated negatively with narcissism, which should be theoretically antithetical to wisdom. For phenomenological wisdom, one possible explanation of the nonsignificant correlation may be that although narcissism may decrease the endorsement of communal items in self-report wisdom measures, it may enhance the endorsement of agentic items. This is because narcissists have been shown to have overly positive perceptions of their agentic traits (e.g., intelligence, creativity, adjustment) but have accurate perceptions of their low levels of communal traits (e.g., care, compassion, and morality; Carlson & Khafagy, 2018). The lack of significant correlation with performative wisdom is hard to explain because performative wisdom measures are unlikely to have been strongly affected by self-enhancement. Because very few studies have measured wisdom alongside narcissism, the estimates of the current meta-analysis may not be reliable, and it is possible that a clearer pattern of the relationship between the two constructs will emerge after more empirical research. We suggest that, given its theoretical relevance, future research should look more into the relationship between wisdom and narcissism, and associations with narcissism may offer an opportunity to evaluate the validity and comprehensiveness of wisdom measures.

Reconciling the two forms of wisdom

Results of the current study necessitate a better understanding of the differences between phenomenological and performative wisdom. We speculate that three potential sources of these differences may be (a) the distinction between typical and maximal performance, (b) the distinction between self-ratings and other-ratings, and (c) the distinction between global and state wisdom.

Typical versus maximal performance

In the context of wisdom, maximal performance refers to how wise one can be, whereas typical performance refers to how wise one is in daily life. Maximal performance is episodic and is typically elicited when individuals know that their performance will be evaluated and so exert their full effort (Sackett et al., 1988). Although these conditions for maximal performance are not explicitly expressed in the instructions of performative wisdom measures, performative wisdom measures can reasonably be seen as measures of maximal, rather than typical, performance. This is because most extant measures of performative wisdom, especially those involving interviews with experimenters, press participants to think more thoroughly about the dilemmas through a series of standard questions. In addition, the task of working through challenging dilemmas in a lab setting may itself be enough to suggest evaluation to participants. Responding to phenomenological wisdom measures, on the other hand, typically entails recalling how one has typically behaved in the past, across many situations. Even when phenomenological wisdom measures assess state-level wisdom, as is the case with the SWIS, it is likely that they capture typical, rather than maximal, performance, because there is no reason to believe that the situational contexts elicit full effort in these cases. The discrepancies between performative and phenomenological wisdom may therefore be exaggerated by the fact that one assesses maximal performance whereas the other assesses typical performance. This implies that the discrepancies may be reduced if performative wisdom can be compared to maximal levels of phenomenological wisdom and vice versa. Because no extant phenomenological wisdom measures assess maximal performance and no performative wisdom measures assess typical performance, the development of these scales may constitute promising areas of future research.

Self-ratings versus other-ratings

Another source of difference between phenomenological and performative wisdom may be the fact that phenomenological wisdom is experienced, whereas performative wisdom is evaluated. All extant performative wisdom measures entail the evaluation of products of wisdom (i.e., participants’ attempts at thinking through a challenging dilemma), whereas phenomenological wisdom measures entail reporting one’s subjective experience of wisdom-related cognitions, motivations, emotions, and behaviors. A high correspondence between the two forms of wisdom therefore entails the successful translation of one’s subjective experience of wisdom into products of wisdom, which are then recognized by other people. It is conceivable that several factors may affect the success of this process, such as ability and knowledge. A high correspondence between subjective (phenomenological) measures and objective (performative) measures also implies a high level of self-knowledge accuracy. Because accurate self-knowledge is regarded as an essential aspect of wisdom (Mickler & Staudinger, 2008), it is possible that the discrepancy between phenomenological and performative wisdom is reduced for wise individuals, a possibility to be examined by future research.

Global versus state wisdom

In this meta-analytic study, we categorized measures of wisdom as capturing either phenomenological or performative wisdom. Phenomenological and performative wisdom are not only theoretically distinct but are also consistent with how wisdom measures cluster together in principal component analysis (e.g., Dong & Fournier, 2022). However, there are other distinctions among the wisdom measures. For instance, wisdom measures also differ in whether they assess state or global wisdom. Specifically, all performative wisdom measures included in this meta-analysis are measures of state wisdom because they assess wisdom performance in one or a few instances. Of the phenomenological wisdom measures, only the SWIS assesses state wisdom, whereas all other phenomenological wisdom measures included in this study assess global wisdom. It is conceivable that some of the differences between phenomenological and performative wisdom are attributable to the state versus global wisdom distinction. The moderate correlations among state wisdom in different situations (Brienza et al., 2018) may explain why performative wisdom measures showed more divergent patterns of correlations than phenomenological wisdom measures. State wisdom also only moderately correlates with global wisdom (Brienza et al., 2018), which may partly explain the finding that the SWIS was unlike the rest of the phenomenological wisdom measures in its correlations with many of the variables examined (i.e., conscientiousness, neuroticism, self-esteem, and negative affect).
The distinctions that we have observed between phenomenological and performative wisdom in the current study may therefore be due to a variety of reasons beyond disagreements among conceptualizations of wisdom. The assessment of typical versus maximal performance, the source of judgment (self vs. others), and the assessment of state versus global wisdom likely all contributed to the divergence between phenomenological and performative wisdom in their relationships with other variables. These factors should be taken into consideration when designing future empirical studies of wisdom.


The findings of the current study allow us to make a few suggestions for future research. The first of these suggestions concerns the selection of the proper wisdom measure(s) to administer in empirical studies. Although some studies have employed a battery of wisdom measures, encompassing measures of both phenomenological and performative wisdom to comprehensively assess the construct (e.g., Dong & Fournier, 2022; Weststrate et al., 2018; Weststrate & Glück, 2017b), such an approach is time-consuming, resource-intensive, and infeasible in many circumstances. Researchers are therefore faced with the decision of choosing one or a few wisdom measures to administer. In many cases, this decision seems to have been made based on the researchers’ knowledge of and familiarity with specific measures, rather than on a systematic evaluation of all available measures given one’s research goals, which can obfuscate the relationships of interest.
Based on the insights gained from the current study, we propose that the following questions should be considered when selecting wisdom measures for a study. First, one should identify the form of wisdom that should be assessed given the research question. Phenomenological wisdom may be more relevant for some research questions (e.g., whether one’s self-perception of one’s wisdom agrees with the perceptions of other people), whereas performative wisdom may be more relevant for other research questions (e.g., whether wisdom predicts more negotiation successes). In addition, it is important to consider whether state wisdom or global wisdom is more relevant. If one is interested in the relationships between wisdom and other variables in specific contexts, then it is more appropriate to administer state measures of wisdom. Conversely, if one is interested in assessing wisdom as a stable characteristic, then one can either administer global measures of wisdom or administer state measures of wisdom multiple times and use the average of states to approximate global wisdom. Second, it is important to consider the content of wisdom measures and how that may affect the results of the study. Ideally, the wisdom measure(s) chosen for a study should be relevant to the research question, but not so much so as to share common dimensions with other variables in the study. For instance, the SAWS showed the highest meta-analytic correlation with trait openness; however, this is likely because openness constitutes one dimension of the SAWS. Thus, if wisdom is to be examined in relation to openness, it may be advisable to avoid using the SAWS as the measure of wisdom because it may artificially inflate the relationship between the constructs.


The current study has several limitations. First, despite our best effort to gather relevant studies, it is unlikely that we have gathered all. Studies that were not in PsycINFO would have escaped the initial literature search. If these studies were not cited by one of the coded studies or submitted by their authors in response to our calls, then they would not have been included in the meta-analyses. Furthermore, some authors did not respond to our requests for submissions, so we were unable to obtain the relevant effect sizes that were not reported in the articles we gathered. There could also be relevant, unpublished data that were not submitted in response to our call. Given that the effect sizes meta-analyzed in the current study are only a subset of all relevant effect sizes, the results of the meta-analyses we present are only approximations of the true associations between wisdom and the criterion variables. Although we have no reason to believe that there were systematic differences between the studies included in the meta-analysis and those that were not, it is possible that the inclusion of additional studies would change the results of the meta-analyses. The results and conclusions of the current study should therefore be viewed as preliminary evidence, rather than final verdicts, on wisdom’s correlations with age, intelligence, the Big Five traits, narcissism, self-esteem, social desirability, and well-being.
Second, our meta-analyses were unable to address the more nuanced associations between wisdom and the criterion variables. For instance, previous studies have shown that the association between age and wisdom changes with age (e.g., Ardelt et al., 2019; Brienza et al., 2018; Webster, Westerhof, & Bohlmeijer, 2014). Although we have offered some preliminary evidence for this postulation by examining the moderating role of age range on the correlation between wisdom and age, the meta-analytic data and technique did not allow us to evaluate whether the association between age and wisdom followed a curvilinear relationship. Likewise, many researchers consider intelligence to be a necessary but not sufficient condition for wisdom (e.g., Glück, 2017; Grossmann et al., 2020; Staudinger & Pasupathi, 2003), which has already received some empirical support (Dong & Fournier, 2022; Glück & Scherpf, 2022); however, we were unable to examine this postulation in the current study. Therefore, although the study provides insights into the rudimentary, linear relationships between wisdom and criterion variables, it is insufficient for a full understanding of these relationships.
Third, because of the small numbers of effect sizes and samples of participants, it was impossible to examine the interactions between the moderators reliably, leading us to decide against conducting such analyses in the current study. Moderators were tested one at a time and independently from each other. This meant that we were unable to address questions such as whether age range moderates the association between age and wisdom differently for different measures of wisdom or whether phenomenological and performative wisdom were differentially associated with crystallized and fluid intelligence. These questions are important and should be addressed by future meta-analytical attempts as more primary studies accumulate.
Fourth, we could not address the moderating role of culture in wisdom’s association with the criterion variables. This was primarily because of the difficulty in appropriately coding the culture of participant samples, as most samples included a mixture of ethnicities, indicating that they may not be uniform in culture. Moreover, most of the samples were collected in Europe and North America. Because other cultures were underrepresented, estimated cultural effects were unlikely to be reliable or accurate. Although the current study could not examine culture as a moderating variable, evidence suggests that culture may indeed play a moderating role in wisdom’s correlation with other variables (e.g., Grossmann et al., 2012). To date, relatively few studies have examined whether the correlates of wisdom change across cultures, a gap that should be addressed by future studies.

For both victims and perpetrators, infidelity was preceded (but not followed) by longer periods of decline in personal and relationship well-being

Estranged and Unhappy? Examining the Dynamics of Personal and Relationship Well-Being Surrounding Infidelity. Olga Stavrova, Tila Pronk, and Jaap Denissen. Psychological Science, November 2, 2022.

Abstract: Although relationship theories often describe infidelity as a damaging event in a couple’s life, it remains unclear whether relationship problems actually follow infidelity, precede it, or both. The analyses of dyadic panel data of adults in Germany including about 1,000 infidelity events showed that infidelity was preceded (but not followed) by a gradual decrease in relationship functioning in perpetrators and victims. There was little evidence of rebound effects in the aftermath of infidelity, with the exception of unfaithful women and individuals with lower initial relationship commitment who returned to the pre-event level of well-being or even exceeded it, providing support to the expectancy violation theory (vs. the investment model of infidelity). By showing that well-being starts to decline before infidelity happens, this study provides a differentiated view on the temporal dynamics of infidelity and well-being and contributes to the literature on romantic relationship dynamics and major life events.


We used prospective dyadic data to examine the temporal dynamics of personal and relationship well-being surrounding experiences of infidelity. Our analyses provided four main findings that we summarize below.
First, for the first time, we showed that infidelity events were preceded by a gradual decrease in personal and relationship well-being in victims and perpetrators, as evident in both actor and partner reports. In perpetrators, this decline might be a reason for starting an affair or even an intentional distress management strategy (see Scott et al., 2017). In victims, a decrease in well-being might be a result of feeling the partner’s dissatisfaction or represent a causal factor increasing their likelihood of being cheated on. Unhappiness has been associated with poor outcomes in social life in previous research (Lyubomirsky et al., 2005; Stavrova & Luhmann, 2016). Hence, a decrease in personal well-being might make the future victim less attractive, contributing to the infidelity of the partner.
Second, in contrast to what most previous research on other negative interpersonal events (e.g., divorce, widowhood) indicated (Denissen et al., 2018; Lucas, 2007; Luhmann et al., 2012), infidelity events were not followed by steady recovery patterns. Although we detected small rebound effects with respect to some of the outcome variables, neither victims nor perpetrators were able to return to their initial levels of well-being. Potentially, the guilt and social disapproval associated with infidelity render this event particularly difficult to recover from.
Third, puzzled by the lack of recovery patterns, we explored potential sources of between-individuals heterogeneity in responses to infidelity. We found that individuals who were more (vs. less) committed to the relationship before the event tended to experience a stronger deterioration in well-being after cheating or being cheated on. Their less committed counterparts, on the other hand, seemed to report an upward well-being trend following infidelity. This pattern is consistent with the expectancy violation theory (Burgoon, 1993): Higher commitment could be associated with higher relationship expectations and stronger disappointment when the expectations are violated.
Interestingly, our exploratory analyses detected one more group of participants who seem to recover and even thrive after infidelity, other than individuals with low relationship commitment: unfaithful women. Women (vs. men) are more likely to mention relationship dissatisfaction as a reason for their affair (Barta & Kiene, 2005), and prior research has shown that acts of infidelity committed because of relationship problems can lead to positive psychological outcomes (Beltrán-Morillas et al., 2020). Potentially, women’s affairs are more likely to be a result of partner dissatisfaction, and consequently, the affair may be a wake-up call for their partners, leading to positive behavioral change. These findings add to the small literature exploring the conditions in which infidelity might have positive consequences (Beltrán-Morillas et al., 2020; Thompson et al., 2021).
Finally, the inclusion of actor and partner outcomes in both victim and perpetrator samples resulted in several potentially interesting observations. Negative well-being consequences (i.e., post-event baseline change) appeared more common in perpetrators who reported cheating themselves (i.e., actor well-being in the perpetrator sample) than in perpetrators whose partner reported cheating (i.e., partner well-being in the victim sample) and in victims (see Figs. 2 and 3). Although this could be partially explained by differences in power (for sensitivity analyses, see the Supplemental Material), the nature of infidelity—disclosed versus secret—could have played a role, too. Disclosed infidelity was presumably more common in the victim sample (as it was reported by the victims) than in the perpetrator sample (as it was reported by the perpetrators). This is consistent with the perpetrator sample being almost twice as large as the victim sample, where secret affairs were probably unreported.
Potentially, perpetrators are more negatively affected by infidelity when it is kept secret (i.e., actor effects in the perpetrator sample) versus disclosed (i.e., partner effects in the victim sample). Disclosing infidelity can help some couples find a solution to the relationship problems that led to infidelity in the first place (Atkins et al., 2005). The higher share of secret affairs in the perpetrator sample versus victim sample could also explain why perpetrators and their partners had chronically lower personal and relationship well-being, relative to the control sample, whereas neither victims of infidelity nor their partners differed from the control sample (selection effects; see Fig. 1). It should be noted that in the absence of the explicit information regarding infidelity disclosure rates, this interpretation remains speculative. Future research should test to what extent the perpetrator-victim differences in the present study are a result of differences in disclosure versus perpetrator/victim status.

Limitations and future directions

The reliance on large-scale panel data resulted in many benefits: It allowed us to identify a high number (~1,000) of infidelity events, track them for several years before and after infidelity, and compare the relationship trajectories of participants who experienced infidelity with a large control sample of individuals who did not (~1,500). However, the reliance on these secondary data restricted our ability to influence sampling (e.g., Germany) and measurement decisions, resulting in several limitations. The lack of information regarding whether the infidelity has come to light or not is one of them (as discussed above). In addition, the phrasing of the infidelity measure (“extra-marital affair”) could have left room for different interpretations (e.g., extradyadic sex vs. an online flirt) and included consensual nonmonogamous relationships. Comparing the effects of different infidelity types as well as examining whether changes in different aspects of relationship functioning could lead to different types of infidelity could be an interesting endeavor for future studies.

Things become more valuable to us merely by virtue of the fact that we possess them

Owning leads to valuing: Meta-analysis of the mere ownership effect. Michał Białek, Yajing Gao, Donna Yao, Gilad Feldman. European Journal of Social Psychology, November 2 2022.

Abstract: Mere ownership effect is the phenomenon that people tend to value what they own more than what they do not own. This classic effect is considered robust, yet effect sizes vary across studies, and the effect is often confused for or confounded with other classic phenomena, such as endowment or mere exposure effects. We conducted a pre-registered meta-analysis of 26 samples published before 2019 (N = 3024), which resulted in psychological ownership on valuing effect of g ∼ 0.57 [0.46, 0.69]. Suggestive moderator analyses supported the use of replica as the strongest moderators. Mere ownership effects were different from the null across all moderator categories and in most publication bias adjustments. We consider this as suggestive evidence that psychological owning leads to valuing, yet caution that much more research is needed. All materials, data, and codes are available on