Quality of Government and Living Standards: Adjusting for the Efficiency of Public Spending. By Grigoli, Francesco; Ley, Eduardo
IMF Working Paper No. 12/182
Summary: It is generally acknowledged that the government’s output is difficult to define and its value is hard to measure. The practical solution, adopted by national accounts systems, is to equate output to input costs. However, several studies estimate significant inefficiencies in government activities (i.e., the same output could be achieved with fewer inputs), implying that inputs are not a good approximation for outputs. If taken seriously, the next logical step is to purge from GDP the fraction of government inputs that is wasted. As differences in the quality of the public sector have a direct impact on citizens’ effective consumption of public and private goods and services, we must take them into account when computing a measure of living standards. We illustrate such a correction by computing corrected per capita GDPs on the basis of two studies that estimate efficiency scores for several dimensions of government activities. We show that the correction could be significant, and that rankings of living standards could be re-ordered as a result.
Despite its acknowledged shortcomings, GDP per capita is still the most commonly used summary indicator of living standards. Much of the policy advice provided by international organizations is based on macroeconomic magnitudes as shares of GDP, and framed on cross-country comparisons of per capita GDP. However, what GDP actually measures may differ significantly across countries for several reasons. We focus here on a particular source of this heterogeneity: the quality of public spending. Broadly speaking, the ‘quality of public spending’ refers to the government’s effectiveness in transforming resources into socially valuable outputs. The opening quote highlights the disconnect between spending and value when the discipline of market transactions is missing.
Everywhere around the world, non-market government output accounts for a large share of GDP and yet it is poorly measured—namely, the value to users is assumed to equal the producer’s cost. Such a framework is deficient because it does not allow for changes in the amount of output produced per unit of input, that is, changes in productivity (for a recent review of this issue, see Atkinson and others, 2005). It also assumes that these inputs are fully used. To put it another way, standard national accounting assumes that government activities are on the best-practice frontier. When this is not the case, national production is overstated. This, in turn, could result in misleading conclusions, particularly in cross-country comparisons, given that the size, scope, and performance of public sectors vary so widely.
Moreover, in the national accounts, this attributed non-market (government and non-profit sectors) “value added” is further allocated to the household sector as “actual consumption.” As Deaton and Heston (2008) put it: “[...] there are many countries around the world where government-provided health and education is inefficient, sometimes involving mass absenteeism by teachers and health workers [...] so that such ‘actual’ consumption is anything but actual. To count the salaries of AWOL government employees as ‘actual’ benefits to consumers adds statistical insult to original injury.” This “statistical insult” logically follows from the United Nations System of National Accounts (SNA) framework once ‘waste’ is classified as income—since national income must be either consumed or saved. Absent teachers and health care workers are all too common in many low-income countries (Chaudhury and Hammer, 2004; Kremer and others, 2005; Chaudhury and others, 2006; and World Bank, 2004). Beyond straight absenteeism, which is an extreme case, there are generally significant cross-country differences in the quality of public sector services. World Bank (2011) reports that in India, even though most children of primary-school age are enrolled in school, 35 percent of them cannot read a simple paragraph and 41 percent cannot do a simple subtraction.
It must be acknowledged, nonetheless, that for many of government’s non-market services, the output is difficult to define, and without market prices the value of output is hard to measure. It is because of this that the practical solution adopted in the SNA is to equate output to input costs. This choice may be more adequate when using GDP to measure economic activity or factor employment than when using GDP to measure living standards.
Moving beyond this state of affairs, there are two alternative approaches. One is to try to find indicators for both output quantities and prices for direct measurement of some public outputs, as recommended in SNA 93 (but yet to be broadly implemented). The other is to correct the input costs to account for productive inefficiency, namely to purge from GDP the fraction of these inputs that is wasted. We focus here on the nature of this correction. As the differences in the quality of the public sector have a direct impact on citizens’ effective consumption of public and private goods and services, it seems natural to take them into account when computing a measure of living standards.
To illustrate, in a recent study, Afonso and others (2010) compute public sector efficiency scores for a group of countries and conclude that “[...] the highest-ranking country uses one-third of the inputs as the bottom ranking one to attain a certain public sector performance score. The average input scores suggest that countries could use around 45 per cent less resources to attain the same outcomes if they were fully efficient.” In this paper, we take such a statement to its logical conclusion. Once we acknowledge that the same output could be achieved with fewer inputs, output value cannot be equated to input costs. In other words, waste does not belong in the living-standards indicator—it remains a cost of government, but it must be purged from the value of government services. As noted, this adjustment is especially relevant for cross-country comparisons.
In this context, as noted, the standard practice is to equate the value of government outputs to their cost, notwithstanding the SNA 93 proposal to estimate government outputs directly. The value added that, say, public education contributes to GDP is based on the wage bill and other costs of providing education, such as outlays for utilities and school supplies. Similarly, for public health, value added largely comprises the wage bill of doctors, nurses, and other medical staff, together with medical supplies. Thus, in the (pre-93) SNA used almost everywhere, non-market output, by definition, equals total costs. Yet the same costs support widely different levels of public output, depending on the quality of the public sector.
Note that value added is defined as payments to factors (labor and capital) and profits. Profits are assumed to be zero in the non-commercial public sector. As for the return to capital, in the current SNA used by most countries, public capital is attributed a net return of zero—i.e., the return from public capital is equated to its depreciation rate. This lack of a net return measure in the SNA is not due to a belief that the net return is actually zero, but to the difficulties of estimating the return.
Atkinson and others (2005, page 12) state some of the reasons behind current SNA practice: “Wide use of the convention that (output = input) reflects the difficulties in making alternative estimates. Simply stated, there are two major problems: (a) in the case of collective services such as defense or public administration, it is hard to identify the exact nature of the output, and (b) in the case of services supplied to individuals, such as health or education, it is hard to place a value on these services, as there is no market transaction.”
Murray (2010) also observes that studies of the government’s production activities, and their implications for the measurement of living standards, have long been ignored. He writes: “Looking back it is depressing that progress in understanding the production of public services has been so slow. In the market sector there is a long tradition of studying production functions, demand for inputs, average and marginal cost functions, elasticities of supply, productivity, and technical progress. The non-market sector has gone largely
unnoticed. In part this can be explained by general difficulties in measuring the output of services, whether public or private. But in part it must be explained by a completely different perspective on public and private services. Resource use for the production of public services has not been regarded as inputs into a production process, but as an end in itself, in the form of public consumption. Consequently, the production activity in the government sector has not been recognized.” (Our italics.)
The simple point that we make in this paper is that once it is recognized that the effectiveness of the government’s ‘production function’ varies significantly across countries, the simple convention of equating output value to input cost must be revisited. Thus, if we learn that the same output could be achieved with fewer inputs, it is more appropriate to credit GDP or GNI with the required inputs rather than with the actual inputs that include waste. While perceptions of government effectiveness vary widely among countries, as, e.g., the World Bank’s Governance indicators attest (Kaufmann and others, 2009), getting reliable measures of governments’ actual effectiveness is a challenging task, as we discuss below.
In physics, efficiency is defined as the ratio of useful work done to total energy expended, and the same general idea is associated with the term when discussing production. Economists simply replace ‘useful work’ by ‘outputs’ and ‘energy’ by ‘inputs.’ Technical efficiency means the adequate use of the available resources in order to obtain the maximum product. Why focus on technical efficiency and not other concepts of efficiency, such as price or allocative efficiency? Do we have enough evidence on public sector inefficiency to make the appropriate corrections?
The reason why we focus on technical efficiency in this preliminary inquiry is twofold. First, it corresponds to the concept of waste. Productive inefficiency implies that some inputs are wasted as more could have been produced with available inputs. In the case of allocative inefficiency, there could be a different allocation of resources that would make everyone better off but we cannot say that necessarily some resources are unused—although they are certainly not aligned with social preferences. Second, measuring technical inefficiency is easier and less controversial than measuring allocative inefficiency. To measure technical inefficiency, there are parametric and non-parametric methods allowing for construction of a best practice frontier. Inefficiency is then measured by the distance between this frontier and the actual input-output combination being assessed.
Indicators (or rather ranges of indicators) of inefficiency exist for the overall public sector and for specific activities such as education, healthcare, transportation, and other sectors. However, they are far from uncontroversial. Sources of controversy include: omission of inputs and/or outputs, temporal lags needed to observe variations in the output indicators, choice of measures of outputs, and mixing outputs with outcomes. For example, many social and macroeconomic factors beyond government spending affect health status (Spinks and Hollingsworth, 2009, and Joumard and others, 2010), and they should be taken into account. Most of the available output indicators show autocorrelation, and changes in inputs typically take time to materialize into variations in outputs. Also, there is a trend towards using outcome rather than output indicators for measuring the performance of the public sector. In health and education, efficiency studies have moved away from outputs (e.g., number of prenatal interventions) to outcomes (e.g., infant mortality rates). When cross-country analyses are involved, however, it must be acknowledged that differences in outcomes are explained not only by differences in public sector outputs but also by differences in other environmental factors outside the public sector (e.g., culture, nutrition habits).
Empirical efficiency measurement methods first construct a reference technology based on observed input-output combinations, using econometric or linear programming methods. Next, they assess the distance of actual input-output combinations from the best-practice frontier. These distances, properly scaled, are called efficiency measures or scores. An input-based efficiency measure indicates the extent to which it is possible to reduce the amount of inputs without reducing the level of output. Thus, an efficiency score of, say, 0.8 means that using best practices observed elsewhere, 80 percent of the inputs would suffice to produce the same output.
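In the simplest case—a single input, a single output, and constant returns to scale—the best-practice frontier is just the ray through the observation with the highest output-per-input ratio, and each unit's input-based score is its ratio relative to that best ratio. The sketch below illustrates this with hypothetical country data (all figures invented for illustration):

```python
# Minimal illustration of input-based efficiency scores under constant
# returns to scale, one input and one output. The frontier is set by the
# observation with the highest output/input ratio; a score of 0.8 means
# 80 percent of the inputs would suffice to produce the same output.

def input_efficiency_scores(data):
    """data: dict country -> (input_cost, output). Returns scores in (0, 1]."""
    best_ratio = max(out / inp for inp, out in data.values())
    return {c: (out / inp) / best_ratio for c, (inp, out) in data.items()}

countries = {
    "A": (100.0, 50.0),  # ratio 0.50 -> frontier country, score 1.0
    "B": (120.0, 48.0),  # ratio 0.40 -> score 0.8
    "C": (80.0, 20.0),   # ratio 0.25 -> score 0.5
}
scores = input_efficiency_scores(countries)
```

Actual studies such as those cited above use richer methods—DEA with multiple inputs and outputs, or stochastic frontiers that separate noise from inefficiency—but the interpretation of the resulting scores is the same.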
We base our corrections to GDP on the efficiency scores estimated in two papers: Afonso and others (2010), covering several indicators for a set of 24 countries, and Evans and others (2000), focusing on health for 191 countries based on WHO data. These studies employ techniques similar to those used in other studies, such as Gupta and Verhoeven (2001), Clements (2002), Carcillo and others (2007), and Joumard and others (2010).
- Afonso and others (2010) compute public sector performance and efficiency indicators (defined as performance weighted by the relevant expenditure needed to achieve it) for 24 EU and emerging economies. Using DEA, they conclude that on average countries could use 45 percent less resources to attain the same outcomes, and deliver an additional third of the fully efficient output if they were on the efficiency frontier. The study included an analysis of the efficiency of education and health spending that we use here.
- Evans and others (2000) estimate health efficiency scores for the 1993–1997 period for 191 countries, based on WHO data, using stochastic frontier methods. Two health outcome measures are identified: disability-adjusted life expectancy (DALE) and a composite index of DALE, dispersion of child survival rates, responsiveness of the health care system, inequities in responsiveness, and fairness of financial contribution. The input measures are health expenditure and years of schooling, with the addition of country fixed effects. Because of its large country coverage, this study is useful for illustrating the impact of the type of correction that we are discussing.
We must note that, ideally, we would like to base our corrections on input-based technical efficiency studies that deal exclusively with inputs and outputs, and do not bring outcomes into the analysis. The reason is that public sector outputs interact with other factors to produce outcomes, and here cross-country heterogeneity can play an important role in driving cross-country differences in outcomes. Unfortunately, we have found no technical-efficiency studies covering a broad sample of countries that restrict themselves to input-output analysis. In particular, these two studies deal with a mix of outputs and outcomes. The results reported here should thus be seen as illustrative. Furthermore, it should be underscored that the level of “waste” identified for each particular country varies significantly across studies, which implies that any associated measures of GDP adjusting for this waste will also differ.
We have argued here that the current practice of estimating the value of the government’s non-market output by its input costs is not only unsatisfactory but also misleading in cross-country comparisons of living standards. Since differences in the quality of the public sector have an impact on the population’s effective consumption and welfare, they must be taken into account in comparisons of living standards. We have performed illustrative corrections of the input costs to account for productive inefficiency, thus purging from GDP the fraction of these inputs that is wasted.
Our results suggest that the magnitude of the correction could be significant. When correcting for inefficiencies in the health and education sectors, the average loss for a set of 24 EU member states and emerging economies amounts to 4.1 percentage points of GDP. Sector-specific averages for education and health are 1.5 and 2.6 percentage points of GDP, implying that 32.6 and 65.0 percent of the inputs are wasted in the respective sectors. These corrections are reflected in the GDP-per-capita ranking, which gets reshuffled in 9 cases out of 24. In a hypothetical scenario where the inefficiency of the health sector is assumed to be representative of the public sector as a whole, the rank reordering would affect about 50 percent of the 93 countries in the sample, with 70 percent of it happening in the lower half of the original ranking. These results, however, should be interpreted with caution, as the purpose of this paper is to call attention to the issue, rather than to provide fine-tuned waste estimates.
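The correction itself is simple arithmetic once sector efficiency scores and spending shares are in hand: the wasted fraction of each sector's inputs, (1 − score) × spending share, is subtracted from GDP. A minimal sketch, with all country figures hypothetical:

```python
# Sketch of the GDP correction described in the text: purge from GDP the
# fraction of public sector inputs that is wasted, given input-based
# efficiency scores and sector spending expressed as shares of GDP.
# All numbers below are hypothetical.

def corrected_gdp_per_capita(gdp_pc, spending_shares, efficiency_scores):
    """gdp_pc: per capita GDP; spending_shares: dict sector -> share of GDP;
    efficiency_scores: dict sector -> input-based score in (0, 1]."""
    waste_share = sum((1.0 - efficiency_scores[s]) * share
                      for s, share in spending_shares.items())
    return gdp_pc * (1.0 - waste_share)

# Hypothetical country: education spending 5% of GDP (score 0.70),
# health spending 4% of GDP (score 0.40).
adjusted = corrected_gdp_per_capita(
    30000.0,
    {"education": 0.05, "health": 0.04},
    {"education": 0.70, "health": 0.40},
)
# waste share = 0.30 * 0.05 + 0.60 * 0.04 = 0.039, i.e., 3.9 percent of GDP
```

Applied country by country, an adjustment of this kind is what generates the re-ranking described above, since waste shares differ across countries.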
A natural way forward involves finding indicators for both output quantities and prices for direct measurement of some public outputs. This is recommended in SNA 93 but has yet to be implemented in most countries. Moreover, in recent times there has been increased interest in outcomes-based performance monitoring and evaluation of government activities (see Stiglitz and others, 2010). As also argued in Atkinson and others (2005), it will be important to measure not only public sector outputs but also outcomes, as the latter are what ultimately affect welfare. A step in this direction is suggested by Abraham and Mackie (2006) for the US, with the creation of “satellite” accounts in specific areas such as education and health. These extend the accounting of the nation’s productive inputs and outputs, thereby taking into account specific aspects of non-market activities.