Radical
Statistics

How Should A Statistical Agency Meet Its Political Responsibilities?

Kenneth Prewitt

The responsibilities of statistical agencies have been codified in Guidelines, Declarations of Principles and Practice, Codes of Conduct, White Papers and similar statements. Although emphases vary from country to country, the basic themes are similar. Statements cover professional staffing, methodological transparency, standardisation and predictability in release of data products, open dissemination, protection of respondent confidentiality, forthrightness about error structure, ethical standards to prevent misuse of data, and independence from political influence or manipulation (Seltzer, 1994).

These widely shared principles are the starting point for reviewing the more 'political' aspects of how agencies discharge their responsibilities. We consider three arenas: national statistics as a public good, statistics in the policy process and the use of statistics in allocating political power. A final section takes up an issue which, though outside the normal purview of statistical agencies, extends the analysis.

National Statistics: A Public Good

Public statistics easily meet the criteria of a public good. They are non-rival in their consumption, with the use by some not diminishing the use by others. The market does not provide them; the market invests in expensive information gathering only if the data are proprietary and thus can be used competitively.

We take for granted that publicly provided statistics are valuable to society well beyond their immediate administrative and political uses by the state. The empirical social sciences, from which we get much of the social intelligence in our society, would not have reached their current level of maturity in the absence of public statistics. Consider demography and vital statistics, economics and economic indicators, political science and voting data, sociology and poverty or migration or race/ethnicity statistics, and, of course, a great deal of historical analysis of the last two centuries. Increasingly, as well, the media cover their chosen topics through the prism of statistical portraits and social indicators.

Though scholars and journalists make the most systematic use of national statistics, other uses also qualify under the heading of 'public goods'. Businesses routinely mine statistical information in making decisions about matters such as plant location or in assessing how changing demographic profiles might shape consumer patterns. Charities, churches, and other groups focused on social reform or service delivery will also turn to public statistics as they plan membership strategies or target their programs.

Public users are sufficiently adept that practically any general statistics will be put to use somewhere, somehow. That is, even data generated for narrow state administrative functions can be made to serve a public good purpose. This fact notwithstanding, the 'responsible' statistical agency knows that it produces a public good and consequently will, as best it can, resist when its government paymasters attempt to limit data collection to the needs of administrative and regulatory agencies.

To see the larger significance of statistics as a public good we take a quick detour. The idea of electoral accountability has a privileged place in democratic theory, rightly so. Competitive elections offer to the electorate alternative portrayals of how well the government-in-power has performed or will perform against future challenges. Aspirants for political office present themselves in terms of their past achievements and their promises of future accomplishments in managing the economy, protecting the nation's security, and generally enhancing national well-being. Voters then elect, re-elect or evict accordingly (Prewitt, 1987).

This simple model presumes that the voters have some way of assessing government performance. Statistical trends and social indicators play a major role. They indicate whether the economy is growing or stagnating, whether education or health or housing is improving, whether crime rates are down, whether the environment is being protected, and so forth. They tell the voter how well the country is doing, and upward or downward ticks in statistical series are grist for assertions and rebuttals about who can take credit for improvements, or should be blamed for failure.

This broad view on the role of statistics in democracy was recognised, for instance, in Statistics: A Matter of Trust (2000) where we read that government statistics should inform 'the people generally about the state of the nation and provide a window on the work and performance of government, allowing the impact of government policies and actions to be assessed'.

In a second, and closely related, fashion democracy is promoted by statistics. How do issues reach the political agenda, especially issues that matter most to the unorganised or disenfranchised interests of society? For instance, how do issues such as child labour or male-female wage differentials become politically salient? One route to the political agenda is through social reform action that takes as its starting point statistical profiles. Since the early days of industrialisation, reform groups have used statistics to document the social conditions that they were committed to eradicating. Examples are poverty laws, prison reform, child abuse, and racial discrimination in education, housing and employment. Reform activists mobilise political participation and inform public debate by transforming previously unnoticed social conditions into highly visible social injustices. In this way resource-poor groups compensate by using state-provided statistical information as a political resource.

Of course the government-in-power does not always welcome having its performance assessed by the electorate or its agenda shaped by social reform movements. When the government is sufficiently discomforted by statistics that it controls, the incentive is strong to minimise their impact.

This of course is why declarations about the principles and practices of statistical agencies stress independence and integrity. Statistical agencies themselves generally need no reminding of these principles. What they need are governments which honour them.

Responsibilities (1)

What, then, are the basic responsibilities of the statistical agency in view of the 'public good' function of national statistics?

First, keep firmly in mind the legitimate needs and preferences of data users outside the government. In presenting statistical programs for public funding, remind ministries and parliamentary committees that information is a public good and that an enlightened government must accept that the criteria for supporting statistics reach beyond the immediate administrative needs of the state.

Second, be guided by the deep principle that statistics, especially as converted into social and economic indicators, advance democratic accountability. This principle gives added weight to the importance routinely attached to the independence and integrity of national statistics.

National Statistics: Public Policy

The state's interest in taxation and military conscription has hovered in the background of every census since biblical times. Today, of course, it is a much broader array of public policies that are shaped by, and justified with, national statistics. The public policy uses of statistics range across every domain: public transportation, health and education services, job training, police and fire protection, sanitation and sewage disposal, housing and land-use, to name a few (Alonso and Starr, 1987).

Thus the state measures the economic life of the nation, using that information to help design a taxation system - whether to tax manufacturing activity, natural resources, individual income, property, at what rates and in what combinations. For example, census based population projections are used by agencies that design social security tax rates and estimate future outlays.

The social welfare state has an especially large appetite for statistics, primarily so it can target benefits to such specific population groups as families in sub-standard housing, children in poverty, the physically disabled, minorities historically disadvantaged or mistreated, victims of particular diseases.

What are generally called 'affirmative action' policies designed to compensate for past injustices make use of statistical profiles. When the state wants to ensure that its legislature is demographically representative of the population at large, for instance, census data are used to monitor whether election districts are drawn such that racial minorities have the opportunity to elect someone from their 'race'. More generally, the principle of statistical proportionality is employed to ensure that racial or ethnic groups historically discriminated against are given educational and employment opportunities or subsidised housing and health care.

Responsibilities (2)

What, then, are the basic responsibilities of the statistical agency in view of the centrality of statistics in public-policy design and implementation?

The familiar principles of timeliness, accuracy, transparency and universal dissemination grow in importance because public policies will be neither fair nor effective in the absence of quality statistics. This truism places a heavy responsibility on the statistical agency. The agency is accountable not just to the ministry in which it is housed or the parliament that grants its funds but to the citizens whose well-being varies with the quality of the nation's public policies. Though this perspective on accountability does not require principles different from those already governing most agencies, it might require a fresh and enlarged understanding of what it is that those principles are in service of.

One way to formulate this enlarged understanding is to recognise that statistical results will and should be used by both sides in political debates over given public policies. The statistics, then, must be collected and presented in a manner that can never be interpreted as having an a priori bias toward a given policy outcome. As we see below, in a different context, this principle is easier in its proposing than in its practice.

Statistical agency responsibilities in the public policy arena have an even broader ethical burden when we consider the possible misuse of statistics in the social control function of the state.

Social control policies that police borders and population movement, and even reproduction and fertility, are fashioned in part on the basis of statistical information. The boundary between routine social control and abusive exercise of state power is not easily drawn, but certainly history offers too many examples of national statistics as an instrument of the latter. Family planning policy and fertility regulation in China and India can be cited. Population displacement is a further example, as census data are used to place restrictions on freedom of movement or to force population groups to live in places they would not have chosen. The use of U.S. census data to facilitate the internment of Japanese-Americans at the beginning of World War II is a sad instance. More tragic has been the use of national statistics in genocidal policies. Population registers and special censuses identifying Jews greatly assisted the Nazis in locating them and arranging transport to concentration camps (Seltzer, 1998).

Here the responsibility of the statistical agency is obvious. In the first instance it must have strong confidentiality policies. As is so often noted, the collection of information from the population requires a high degree of public confidence that promises of confidentiality will be honoured. That confidence is hard to earn, hard to keep. In many countries there is public scepticism: we recall the experience of the aborted German census in the early 1980s, the sharp public backlash in Canada over the consolidated database that was being assembled, and the privacy battles that emerged in the midst of the recent U.S. census. That we are entering a new era of privacy concerns, generated of course by the rapid growth of e-commerce and web surfing, affects every statistical operation: data collection, storage, consolidation, dissemination. The insatiable need for information in a 'knowledge economy' confronts the high value a free society places on privacy, and national statistics are caught in the crossfire. Never has the promise and practice of confidentiality been so necessary.

The gross misuse of statistical information cited above suggests that the standard confidentiality provisions will not always suffice. The public expects more than confidentiality. There must be guarantees that no information provided in a government statistical operation can ever be used against those who provide the data. This is difficult to guarantee. Protections beyond confidentiality policies should be considered, such as disclosure avoidance practices that make the data unreliable for the identification of individuals and their locations.

In extreme instances, when national security is at risk, the government may conclude that a statistical agency's microdata would be useful and make demands difficult for the agency to deny. Because the statistical agency does not make the laws or the policies, the most important guarantees of confidentiality have to come from the government itself. National parliaments should examine whether they have adequate legal prohibitions against using statistical information to bring harm on individuals. Statistical agencies should advocate and actively co-operate with policies that protect individuals from misuse of the data they collect. The ultimate responsibility, however, is government wide.

National Statistics: Allocating Power

Of course the fundamental principle that national statistics must be free of and be seen to be free of political interference informs the two arenas of responsibility now reviewed - the public good and public policy.

That principle is not less important but certainly becomes more nuanced as we turn attention to the role of statistics in distributing political power and resources. Here the United States, and in particular its decennial census, will be used as the case-in-point. A partisan battle lasting several decades throws into sharp relief the complexities that occur when a given statistical operation has a specific political purpose.

The story starts with the U. S. Constitution of 1787, wherein is the provision for a decennial census so that seats in the lower house of congress can be 'apportioned among the several States which may be included within this Union, according to their respective Numbers'. Distributing power among the states was to occur every 10 years, both to reallocate as the numerical strength of states shifted with population growth and movement and to account for the new states expected to join the Union as the western regions of the country were settled.

In serving these state-building functions the decennial census was intended to be and has been 'political'. What could be more political than allocating power from one part of the country, with its economic and social interests, to another part, perhaps with different and even conflicting interests - such as shifting power from the slave-holding South to the new Western states based on free labour?

The decennial census worked as anticipated. Every decade a census was taken. Every decade, with one exception, political power was reallocated as the population grew and moved westward. The exception occurred after the 1920 census reported the massive wartime population movement from the rural and conservative south to northern cities where liberal political interests dominated. Southern interests were strong in the Congress and reapportionment was delayed for more than a decade (Anderson, 1988).

The refusal to reapportion in 1920 is a useful point of comparison for the census politics that emerged in 1980. The political debate in the 1920s pitted region against region, and focused on the use of census results. In this, 1920 was similar to previous battles over the census that extended across the 19th century.

Starting in 1980 and peaking with the 2000 census a different pattern prevails. The battle is strictly along political party lines and is being waged over census methods more than results. Or, to frame this last and critical point differently, the battle is about how the methods might affect the results (Anderson and Fienberg, 1999).

A combination of events in earlier decades set in motion the current partisan battles. A key event was the identification of the magnitude of undercoverage in the 1940 census and, especially, how that undercoverage was differently distributed across geographic areas and demographic groups. The limited data available for this research (vital statistics at first, and later data generated from the decennial short form) showed that the groups least well counted were racial minorities. Racial justice and the methodology of census taking converged.

Two developments in the 1960s further raised the political stakes. One was the sharp expansion of federal funds allocated on the basis of spending formulas using census data. The other was the passage of civil rights legislation that used census data to document policies and practices that discriminated against racial minorities, especially their access to the ballot box.

To the original constitutional purpose of reapportioning the lower house of Congress were thus added two further political benefits tied to census results: federal funds and civil rights. And these political benefits are largely distributed on a share or zero-sum basis. If one state gains a congressional seat, another state loses one. If one area receives more federal funds, other areas receive less. In distributional politics, a differential census undercount is quickly judged to be unacceptable. The area undercounted relative to other areas is deprived of equal representation in legislative bodies and deprived of its appropriate share of federal funds. When demographic groups are concentrated in undercounted areas, such as racial minorities in large cities, the undercount plays into electoral politics. This is the case in the U. S.

The Census Bureau, supported by a sizeable number of professional statisticians and survey methodologists, concluded that the persistent differential undercount could not be eliminated by simply counting better. Undercoverage is a fact of any census and there is reason to expect that coverage errors will not be evenly distributed across social groups and areas of the country.

In seeking a solution to the differential undercount the Census Bureau turned to dual system estimation. This methodology was used in the 1980 census, but the Census Bureau judged that the methodology was not yet robust enough to correct the initial census counts. It withstood legal challenges that would have mandated these corrections.
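The core idea of dual system estimation can be illustrated with the simplest capture-recapture calculation. This is only a minimal sketch and not the Census Bureau's actual procedure, which works within post-strata and involves many refinements. It assumes that the census and an independent follow-up survey are statistically independent and that everyone in the group has the same chance of being counted. The function name and all figures are hypothetical.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Simplest (Lincoln-Petersen) dual system estimate of the true population.

    Assumes the census and the follow-up survey are independent and that
    everyone in the group has the same chance of being counted.
    """
    if matched_count <= 0:
        raise ValueError("matched_count must be positive")
    return census_count * survey_count / matched_count


# Hypothetical figures for one population group (not real census data).
census_count = 9_000   # people the census counted in the sampled areas
survey_count = 1_000   # people found there by an independent survey
matched_count = 900    # survey people who were also found in the census

estimate = dual_system_estimate(census_count, survey_count, matched_count)
undercount_rate = 1 - census_count / estimate
```

With these illustrative numbers the survey finds that one person in ten was missed by the census, so the estimate is scaled up accordingly. Applying the calculation separately to groups with different coverage rates is what allows a differential undercount to be measured.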

An improved dual system estimation methodology was used in the 1990 census, and this time the Census Bureau was confident that its application would make the census count more accurate. Its decision, however, was overruled by the Secretary of Commerce (the U. S. Census Bureau is housed in the Department of Commerce). His reasoning included the following passage:

... the choice of the adjustment method selected by the Bureau officials can make a difference in apportionment, and the political outcome of that choice can be known in advance. I am confident that political considerations played no role in the Census Bureau's choice of an adjustment model for the 1990 census. I am deeply concerned, however, that adjustment would open the door to political tampering with the census in the future.

This passage is the first instance in American history wherein a high government official gives voice to the spectre that the non-partisan, professionally managed Census Bureau might choose a data collection methodology so as to favour one political party over another. Very quickly the conditional tense was dropped, and it is now commonplace to encounter the charge that Census 2000 has been designed to benefit one political party and penalise the other.

A number of congressional votes on census methodology have been strictly along party lines; there has been one presidential veto, with threats of others; and there have been numerous votes in state legislatures, again on party lines. There has even been litigation, with census methods twice reaching the Supreme Court in this decade.

This is the context for further examining the principles of responsibility. The decennial census allocates legislative seats. Political leaders argue that this allocation is affected by the differential undercount. The technical solution to correct for the undercount, dual system estimation, is thought by the Democratic Party to improve its political fortunes and by the Republican Party to be harmful. Consequently, census methodology has become the subject of an intense partisan battle.

Responsibilities (3)

Up to this point of the analysis, issues of responsibility have been comparatively straightforward. The statistical agency should distance itself from partisan politics, discharging its responsibilities by providing accurate data to a broad array of users inside and outside the government. But the case of the U. S. census is about counts that rearrange the distribution of partisan interests, and thus become proper objects of partisan attention. What, now, constitutes 'responsibility'?

We take for granted that the agency will select methods and reporting procedures without regard to their consequences for partisan outcomes. For instance, nothing in the practices, principles, or policies of the U. S. Census Bureau has ever been or can be cited as evidence that it designed the 2000 decennial with intent to influence partisan outcomes.

In a highly charged partisan environment, these principles and practices will not always be accepted at face value. A statistical agency must take the additional step of being unusually transparent. It may have to be more open than prudent statistical practice suggests. In conducting the decennial census in 2000, the U. S. Census Bureau has prespecified its procedures, operations, and design choices far beyond normal practice. From a statistical perspective, the data patterns generated by dual system estimation should be examined before making final choices about, for example, how discrete post-strata (grouping the population into strata on the basis of their coverage probabilities in the census) might be combined. The Census Bureau, however, has prespecified its post-strata in a manner that discourages subsequent collapsing even if that makes statistical sense.

From a political perspective, this prespecification increases congressional confidence that the design is without partisan political intent. Consequently, in this instance the Census Bureau forgoes best statistical practice in order to discharge its responsibilities not just to be but to be seen to be politically neutral.

Transparency involves more than prespecification. In the case of the U. S. census it involved co-operating with and even inviting a level of public scrutiny unprecedented in the agency's history, and probably unprecedented for any large-scale statistical operation. The Census Bureau made available a terabyte of real-time operational information; it provided briefings to congressional oversight committees and their staffs on nearly a weekly basis; it met frequently with a half-dozen advisory committees; it gave regular operational press briefings; it was subjected to ongoing scrutiny by the General Accounting Office of the Congress, by the Inspector General of its parent ministry, and by a special Census Monitoring Board that reported to the Congress and the Administration. Several hundred independent investigators, auditors, legislative staffers, and other overseers had access to all census operations.

Obviously the level of oversight deflected management time and resources that otherwise would have focused on census operations themselves. As with prespecification, however, what might not make sense operationally was important in demonstrating that the census was conducted without partisan intent.

That is, a statistical program that has critical partisan consequences must be designed and conducted in a manner to persuade partisan interests that it is politically neutral in its intent even though its results are not neutral.

This principle keeps us in the familiar, and comfortable, territory of the independence of statistical operations from political interference. It is not enough.

In Conclusion: A Counter-Intuitive Observation

In the case under consideration - the decennial census of the U. S. - a statistical result redistributes political power. The U. S. case is not unique. There are dramatic instances when statistical results might have upset power arrangements, and the data were suppressed. Both China and the Soviet Union provide examples. And there are many, many quieter instances scattered across the landscape where statistics and politics interact. What is called for is a principle of responsibility that can be invoked when it is known beforehand that a statistical result will redistribute power.

The overriding goal must be to protect scientific method. This is not easy. When a scientific result is applied to something so inherently conflictual as the allocation of political power, institutional theory from the social sciences tells us that this will push any partisan battle back to the pre-result stage - that is, to the scientific method that generates the result. How to prevent this?

One solution is to avoid using the particular method that invites the partisan fight. In the U. S. case, and no doubt many others, this would not be easy. Political interests that believe they will benefit from census counts corrected by dual system estimation will argue for and even litigate to assure that they are used. The Census Bureau could not unilaterally select against a methodology that happens to be opposed by one side in a partisan argument. Such a choice would, rightly, be said to be made on political and not statistical grounds.

There is a second though much more complicated solution. It refers to how the census results are used rather than how they are produced. Normally a statistical agency would not have views on the manner in which its results are applied in the political arena. But when statistical methodology is put at risk, the agency can justifiably be proactive in taking steps to lessen the risk.

Perhaps consideration can be given to using statistical results in a less deterministic manner. By deterministic we mean the application of results in a manner that precludes a role for partisan give and take. In the instance of the U. S. decennial, the census counts are entered into a formula that determines how many congressional seats are allocated to each of the 50 states. In a more complicated process these census counts are used, at the state level, to draw the boundaries for a number of different election units including congressional districts and many state and local districts.
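To make concrete how mechanically counts translate into seats, the following minimal sketch implements the method of equal proportions (Huntington-Hill), the rule used for apportioning the U. S. House. Each state first receives one seat; every remaining seat then goes to the state with the highest priority value, its population divided by the square root of n(n+1), where n is the number of seats it already holds. The state names, populations and the seat total are hypothetical.

```python
import heapq
import math


def apportion(populations, total_seats):
    """Allocate seats by the method of equal proportions (Huntington-Hill)."""
    if total_seats < len(populations):
        raise ValueError("need at least one seat per state")
    seats = {state: 1 for state in populations}  # every state gets one seat
    # Max-heap (via negated values) of each state's claim to its next seat.
    heap = [(-pop / math.sqrt(1 * 2), state) for state, pop in populations.items()]
    heapq.heapify(heap)
    for _ in range(total_seats - len(populations)):
        _, state = heapq.heappop(heap)
        seats[state] += 1
        n = seats[state]
        # Priority for the state's next seat, given it now holds n seats.
        priority = populations[state] / math.sqrt(n * (n + 1))
        heapq.heappush(heap, (-priority, state))
    return seats


# Hypothetical populations for three states sharing ten seats.
populations = {"A": 6_000_000, "B": 3_000_000, "C": 1_000_000}
seats = apportion(populations, 10)
```

Nothing in the procedure leaves room for negotiation: once the populations are fixed, the seat totals follow. This is the sense in which the use of the counts is deterministic.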

With respect especially to the first of these tasks - allocating seats to the 50 states - the process is highly determined. Once the count is in, there is no opportunity for partisan differences to be taken into account. But the law could be changed in this regard. For example, census counts are also used in federal funding formulas. These formulas often incorporate smoothing functions and hold-harmless provisions that lessen the disruption that otherwise would occur as population numbers shift from one census to the next. Something similar could be considered for allocating congressional seats (or drawing district boundaries, though this is more complicated than can be considered here). For example, irrespective of the magnitude of population shifts from one census to the next there could be a smoothing function that prevented a state losing or gaining more than a few congressional seats after any given decennial. Or, there could be a hold-harmless decision rule that takes into account the prevailing patterns of party strength.
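A hold-harmless provision in a funding formula might look like the following minimal sketch. It is an illustration under stated assumptions, not a description of any actual federal formula: funds are shared in proportion to population, except that no area may fall below a fixed fraction (here 90 per cent, an arbitrary choice) of its previous allocation. The function name, areas and amounts are hypothetical. An analogous rule for congressional seats would also have to handle whole numbers of seats, which is not attempted here.

```python
def hold_harmless(total, populations, previous, floor_share=0.9):
    """Share `total` by population, guaranteeing each area at least
    `floor_share` of its previous allocation."""
    floors = {area: floor_share * previous[area] for area in populations}
    if sum(floors.values()) > total:
        raise ValueError("guaranteed floors exceed the total available")
    protected = {}           # areas held at their guaranteed floor
    free = dict(populations)  # areas still sharing by population
    while True:
        remaining = total - sum(protected.values())
        free_pop = sum(free.values())
        shares = {a: remaining * p / free_pop for a, p in free.items()}
        below = [a for a in free if shares[a] < floors[a]]
        if not below:
            return {**protected, **shares}
        # Pin the areas that fell below their floor, then re-share the rest.
        for a in below:
            protected[a] = floors[a]
            del free[a]


# Hypothetical example: area C has lost population since the last census.
populations = {"A": 50, "B": 30, "C": 20}
previous = {"A": 40.0, "B": 30.0, "C": 30.0}
allocation = hold_harmless(100.0, populations, previous)
```

In this example a purely proportional split would cut area C from 30 to 20; the rule holds it at 27 and the other areas absorb the difference. The choice of floor is exactly the kind of decision rule over which partisan interests could legitimately bargain, leaving the counts themselves untouched.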

This shifts the partisan battles to the decision rules. Such an arrangement gives legitimate scope to the play of partisan interests in the final determination of distributional shares, and assigns to the statistical agency its proper responsibility of providing the most accurate counts it can.

It is not my intention to recommend such a change in law, which has implications not here considered for the 'one person, one vote' provision in American electoral politics. Here I simply illustrate that the situation in which the U. S. decennial census finds itself is damaging to best statistical practice. It makes it difficult and perhaps impossible to realise the basic principle that the statistical agency should be concerned only with the quality and accuracy of its counts and that partisan interests should accept those counts, and argue only about how they are to be used.

References

Alonso W. and Starr P. (eds.) (1987) The Politics of Numbers, New York: Russell Sage Foundation.

Anderson M. (1988) The American Census: A Social History, New Haven CT: Yale University Press.

Anderson M. and Fienberg S. (1999) Who Counts? New York: Russell Sage Foundation.

Martin M. E. and Straf M. L. (eds.) (1992) Principles and Practices for a Federal Statistical Agency, Washington D.C.: National Academy Press.

Prewitt K. (1987) 'Public Statistics and Democratic Politics'. In: Alonso W. and Starr P. (eds.) The Politics of Numbers, New York: Russell Sage Foundation.

Seltzer W. (1998) Population Statistics, the Holocaust, and the Nuremberg Trials. Population and Development Review. 24 (3): 511-522.

Seltzer W. (1994) Politics and Statistics: Independence, Dependence, or Interaction? United Nations Department for Economic and Social Information and Policy Analysis, Working Paper Series No. 6. New York, NY.

Statistics: A Matter of Trust. A Consultation Document. (2000) Publication of the British Government by The Stationery Office Limited, London.

Kenneth Prewitt
Former Director of the US Bureau of the Census
Dean of New School University's Graduate School of Political and Social Science in New York, USA.
Prewitt@newschool.edu

NB: This paper was first presented at the Royal Statistical Society annual conference which was held in Reading, 2000.