Feeling the Elephant, Coming to Blows

Economists since Adam Smith have asserted that competition among schools is good for everyone but teachers — and probably good for teachers, too, since it broadens the market for their services.

And almost everybody has a favorite example of effective competitive policy in education.

Two of the most successful in the United States:

  • The Morrill Act of 1862 (or, more properly, the Lands for Colleges of Agriculture and Mechanic Arts Act), which broke the power of the old universities of the East by creating well over 150 public land-grant colleges and universities in midwestern and western states and territories (the sixteen southern states had to wait until 1890 for their turn);
  • The GI Bill of Rights (officially the Servicemen’s Readjustment Act of 1944), which over 7 years distributed vouchers to some 8 million veterans, enabling them to attend whatever educational institution was willing to admit them.

But when it comes to the long-standing commitment of the industrial democracies to universal public education — to the finer points of primary and secondary education — it becomes very tricky to pin down what kind of choice works best. After all, college students are free to move. Children are raised in families, and families live in particular places.

Charter schools, voucher plans, church-run schools and private schools, the subtle competition among school districts and property values, mostly in the suburbs, that economists call Tiebout-choice (after the man who first analyzed it), and endless varieties of particular reforms within school organizations, everything from uniforms and standardized testing to decentralization, all have their enthusiasts.

How to judge among competing policies? One device is to look for big public policy experiments that may produce striking results — vouchers in Milwaukee, charter schools in Michigan and Arizona.  Another is to look for natural experiments that history and nature may have set up.

That was economist Caroline Hoxby’s big idea when, in 1994, as a young assistant professor at Harvard, just out of graduate school at the Massachusetts Institute of Technology, she settled on streams as an instrument for measuring the degree of competition among public school districts within metropolitan areas. Here were boundaries that were simply given by nature, not created by people pursuing their own advantage.  Streams led to fragmented districts in some places and to relatively unified jurisdictions in others, providing an exogenous baseline against which to measure the rest.

That was before she learned that early American laws (and even the medieval philosopher Maimonides’ Rule) held that students ought not to have to cross streams in order to receive instruction. She had simply noticed that school district boundaries frequently were streams.

The idea was that differences in the ease with which parents could historically choose among city districts (the voting-with-feet process of so-called Tiebout choice, so pronounced among suburbs first in the railroad epoch, then in the automotive age) should have something to say about the debates of the early 1990s over school vouchers.

Did inequality increase as rich people crowded poor people out of the best schools? Did competition encourage higher productivity among all schools (as measured by the ratios of, say, test scores to budgets)? Did vigorous competition among public schools reduce the number of private schools, parochial schools in particular?

So Hoxby spent months laying a ruler on topographical maps, measuring streams, eliminating big rivers, seeking to distinguish between center cities and residential neighborhoods, creating a measure of school concentration across the nation that, when matched with expenditures and compared with measures of student achievement, would allow her to discriminate among competing claims for choice.
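The logic of using streams as an instrument can be illustrated with a toy simulation. The sketch below is not Hoxby’s specification or data: every number is synthetic, and the variable names (`streams`, `unobserved`, `competition`, `achievement`) and the assumed true effect of 0.5 are hypothetical choices made only for illustration. It assumes some unobserved factor (family wealth, say) influences both how fragmented districts are and how well students do, so that a naive comparison overstates the effect of competition, while a comparison that leans only on the variation supplied by streams does not.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


TRUE_EFFECT = 0.5   # assumed effect of competition on achievement (hypothetical)
N_AREAS = 20000     # number of synthetic metropolitan areas (hypothetical)

# Streams are "given by nature": drawn independently of everything else.
streams = [random.gauss(0, 1) for _ in range(N_AREAS)]

# An unobserved factor that affects both competition and achievement.
unobserved = [random.gauss(0, 1) for _ in range(N_AREAS)]

# Competition depends on streams, the unobserved factor, and noise.
competition = [z + u + random.gauss(0, 1) for z, u in zip(streams, unobserved)]

# Achievement depends on competition, the unobserved factor, and noise.
achievement = [TRUE_EFFECT * x + u + random.gauss(0, 1)
               for x, u in zip(competition, unobserved)]

# Naive estimate: slope of achievement on competition (biased by the confounder).
ols_slope = cov(competition, achievement) / cov(competition, competition)

# Instrumental-variables estimate: uses only the variation that comes from streams.
iv_slope = cov(streams, achievement) / cov(streams, competition)

print(f"naive slope: {ols_slope:.3f}")
print(f"IV slope:    {iv_slope:.3f}")
print(f"true effect: {TRUE_EFFECT}")
```

In this made-up world the instrument recovers something close to the assumed effect, and the naive slope does not. Whether real streams satisfy the same assumptions is exactly what the dispute described below is about.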

She concluded that, in general, things worked pretty much as advocates of voucher proposals had predicted. Sorting between rich and poor increased, but there was no evidence that disadvantaged groups fared more poorly as a result. Areas with more choice had fewer inputs — fewer teachers, larger classes — but better than average student performance, as measured by students’ educational attainment, wages and test scores.

Six years later, Hoxby’s investigation was finally published in the American Economic Review as “Does Competition among Public Schools Benefit Students and Taxpayers?”  It quickly became a famous paper, admired as much for its ingenuity in adopting topography as a baseline as for its technical sophistication, and classified as generally conservative in its orientation.

Outside economics, the applied economist’s arguments about data and econometrics probably had very little direct effect. Inside the profession, however, she became an increasingly senior figure, with something interesting to say about nearly every significant issue in the economics of education, and a powerful advocate for school choice, especially the entrepreneurial start-ups known as charter schools. Granted tenure by Harvard, she was appointed director of the program of education economics at the National Bureau of Economic Research, and became a fellow of the Hoover Institution at Stanford.

After her remarks before the faculty last spring upbraiding Harvard University president Lawrence Summers for incivility — a moment whose solemnity some participants compared to the occasion fifty years before when lawyer Joseph Welch challenged Sen. Joseph McCarthy during Senate hearings — Hoxby at 39 became an academic star of the first magnitude. She is weighing competing complicated offers from Harvard and Stanford to spend the rest of her career at one place or the other.

Last spring, however, Hoxby’s original methodology on the streams paper was challenged by Princeton University economist Jesse Rothstein, 31, fresh out of the University of California at Berkeley and just starting his career. He had discovered “several important errors” in Hoxby’s data and code, Rothstein wrote.

Moreover, he asserted, her entire exercise had turned out to depend on the way that “large streams” were defined. Replacing her streams with simpler measures that he described as being equivalent, he found the effect of competition among districts dropped to near zero. When he was done, he said, there was little evidence left in Hoxby’s paper that competition among schools raised their productivity.

Most darkly, Rothstein wrote that “despite repeated requests over several years,” Hoxby had not provided him with the data that she used to generate her published paper. Eventually, he asserted, she had provided him with a corrected data set. The new data generated different results, said Rothstein. In order to replicate the original analysis, he sought to recreate the original data set by hand.  When he did, he obtained “somewhat weaker” results.

(Hoxby says she didn’t correct anything; that she used data that the National Center for Education Statistics had corrected itself; and that she took extraordinary pains to obtain its release from confidentiality requirements imposed by the NCES, to annotate its coding and make it available on a CD. “I released not only the data, but every bit of my raw data and code, which is more than anyone else had ever done in microeconomics.”)

Rothstein sent in “A Comment on Hoxby” to the National Bureau of Economic Research last December, and at the same time to the American Economic Review. Its 69 pages of dense econometrics didn’t appear on the NBER website until March, long enough for Hoxby to have prepared a fiery 33-page reply (“I discuss every claim of any importance in the comments. I show that every claim is wrong”).  The two working papers were released simultaneously in March.

Then the argument was buttoned up. Instead of eliciting (or permitting) an immediate rejoinder from Rothstein, AER editor Robert Moffitt of Johns Hopkins University, himself an expert in applied economics, decided to work out the differences between the two analyses behind the scenes, with a view to eventually publishing in his journal what could be substantiated in each note — and presumably leaving all the rest on the cutting room floor.

A period of laborious negotiation thus began, of passing drafts back and forth among various experts chosen by the editor to serve as anonymous referees, and between the parties themselves — completely obscuring the controversy for the time being.

Meanwhile, two news stories about the fracas have appeared, first in the Harvard Crimson last summer, then on page one of The Wall Street Journal last week (subscription required). “Making Waves,” was the Journal’s headline: “Novel Way to Assess School Competition Stirs Academic Row/To Do So, Harvard Economist Counts Streams in Cities: A Princetonian Takes Issue/ Charges and Countercharges.” Jon Hilsenrath is the Journal’s excellent economics reporter; his unassailable conclusion: “Despite a vast array of statistical tools, economists have had a very hard time coming up with clear answers.”

Certainly the drama is very great. An eminent figure in economics poised on the brink of great influence, a youthful whistleblower just starting out on his career, each supported by a coterie of loyal seconds, separated by only a few years of age but miles apart in their political convictions. (Hoxby graduated from Harvard College in 1988, Rothstein in 1995. Rothstein worked for a year and a half at the organized-labor-friendly Economic Policy Institute in Washington before heading off to graduate school.)

And certainly the atmosphere is thundery enough. Though down from the fever peak of a few years back, charges of professorial misconduct still make news. Twice last week the Boston Globe front-paged a story about an MIT immunologist who had been fired for fabricating data.  

But the controversy has less than nothing to tell about the mechanisms of school choice. There is much to learn, however, about the opposition between the culture of replication in economics and the tradition of original work.

(A second, even more famous paper was challenged last year by a Berkeley graduate student. Last spring MIT economist Daron Acemoglu won the John Bates Clark medal, awarded every two years to the American economist judged to have made the greatest contribution to the field before the age of 40. Central to the prize was his 2001 paper, “The Colonial Origins of Comparative Development: An Empirical Investigation,” joint with Simon Johnson and James Robinson, which argued that the presence or absence of property rights accounted for different paths of development.

(A few months earlier, Berkeley’s David Albouy had published an account of his attempt to replicate Acemoglu’s results.  He concluded that a key data series in the original paper suffered from “inconsistencies, questionable judgments and errors.” Acemoglu and his co-authors replied in an extensive note that there was “no foundation” to any of the criticisms raised by Albouy. “At best they reflect a long list of mistakes on his part in coding and selecting data.” And there the matter seemed to rest.)

Is Caroline Hoxby a model investigator or a secret artist of the thumb on the scale? Is Jesse Rothstein a forensic economist or a suicide bomber? Informed discussion will have to wait until the editors of the American Economic Review have done their work, probably some time early next year. Even then, a new round of competing claims will be required to bring the issue into focus, to decide who got the better of the argument. (An extensive summary and a pretty persuasive preliminary judgment by a knowledgeable if anonymous economist may be found at The Lowest Deep.)

It’s a good thing, therefore, to remember that Adam Smith himself soon found he had plenty of competition in seeking truth — from political theorists, sociologists, psychologists, anthropologists, evolutionary biologists and, in the 20th century, the myriad management scholars who work in schools of business, government and education, who are among the best-informed experts of all. And of course the most significant developments at every turn probably have come from practitioners themselves — in this case, educators, reformers and politicians.

A case in point: instead of seeking illumination in the argument about the degree to which the estimated variance-covariance matrix for the instrumental variables estimator is robust to arbitrary forms of intra-cluster correlation and cluster-based heteroskedasticity, consider William Ouchi, who is intent on approaching the problem from the other end of the stick.

He thinks the right answer is to empower school principals to customize their schools.

Ouchi, a UCLA business professor, has a PhD from the University of Chicago. He has spent 35 years studying the design and structure of very large business organizations, and until recently, he was best-known for his 1981 best-seller Theory Z: How American Management Can Meet the Japanese Challenge, which counseled confidence at a time when it was widely expected that Japan might soon surpass the United States in industrial leadership.

But for most of the last decade, Ouchi has devoted himself to the cause of school reform.  To the extent his analysis is grounded in technical economics, it flows from the insight, formulated in 1975 by University of California at Berkeley economist Oliver Williamson as “the M-form hypothesis,” that “with large size comes administrative bulk-up along with many negative consequences and that the chief antidote is to decentralize decision-making down to the operating sub-units.”

In corporations, this means multidivisional management, with central offices providing services such as payroll and insurance in which economies of scale are pronounced, but otherwise leaving to the operating divisions most choices about how to proceed. In schools, according to Ouchi, it means pushing decision-making authority down to the level of individual schools while building up mechanisms to assure accountability. Since 1932, the number of students enrolled in public schools in the United States has doubled, from 24 million to 50 million, according to Ouchi. In the same period, the number of school districts has declined by a factor of about eight, from around 127,000 to around 16,000.

“Few business organizations that live in a competitive world could survive that much growth without fundamentally altering their organizational form through decentralization. But school districts, not living in a competitive world, have not changed their form. They remain every bit as centralized as when they were one-fifteenth their present size.”
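The enrollment and district figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the rounded numbers given in the text. The variable names are illustrative, and averaging students evenly over districts is a simplifying assumption, since real districts vary enormously in size.

```python
# Rounded figures as quoted in the text (attributed to Ouchi).
students_1932 = 24_000_000
students_now = 50_000_000
districts_1932 = 127_000
districts_now = 16_000

# How many times fewer districts there are today.
district_decline = districts_1932 / districts_now

# Average enrollment per district, then and now (a crude even-split average).
avg_size_1932 = students_1932 / districts_1932
avg_size_now = students_now / districts_now

# How many times larger the average district has become.
size_growth = avg_size_now / avg_size_1932

print(f"districts fell by a factor of {district_decline:.1f}")
print(f"average district, 1932: {avg_size_1932:.0f} students")
print(f"average district, now:  {avg_size_now:.0f} students")
print(f"average district grew by a factor of {size_growth:.1f}")
```

On these rounded numbers the count of districts fell by a factor of about eight, while the average district grew roughly sixteenfold, which is in the neighborhood of the “one-fifteenth” in Ouchi’s remark.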

Ouchi got involved when his friend venture capitalist Richard Riordan (later to become mayor of Los Angeles) formed an organization with the late Helen Bernstein, then president of the teachers’ union of Los Angeles, and Robert Wyckoff, president of Arco Oil, to study the ills of California’s giant school system (6 million students, 12.5 percent of the nation’s total).

The problem, they concluded, was not that the students were immigrants or that teachers were bad or, least of all, that the schools didn’t have enough money. It was that individual schools, serving unique groups of families, lacked the autonomy to come up with their own instructional approach. Principals were articulate bureaucrats instead of leaders.

Ouchi designed a study, raised a million dollars, hired a group of researchers and set out to visit 223 schools in the US and Canada, the three largest school districts in the US among them — New York, Chicago and Los Angeles. The results are reported in Making Schools Work: A Revolutionary Plan to Get Your Children the Education They Need.

There are a few tables in Ouchi’s book, and no equations.  Instead there are plenty of point-making anecdotes. One concerns the morning when Ouchi called the headquarters of the Archdiocese of New York in preparation for his visit.  How many central office staff did it have for its 120,000 students? He called well ahead on the assumption that it would take days to assemble the data.

“Do you really need to know that?” asked the woman who answered the phone. Ouchi assured her that it was an important part of his study. “Well, just a moment, I’ll go count them,” she replied, returning a few moments later to announce the results of her census. There were 22 workers, including secretaries and clerks.

In contrast, New York City had 3,000 headquarters staff for its 1.2 million public school students — plus another 22,500 workers reporting to them distributed around the city — 25,500 central office workers instead of the 220 if strict proportionality to the parochial schools’ organization were to be observed. “They work under the orders of someone in the central office, and they do what the central office wants done,” says Ouchi, “not what the principal wants done.”
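The proportionality claim is easy to reproduce. The sketch below scales the Archdiocese’s staffing ratio up to the city’s enrollment, using only the figures quoted in the text. The variable names are illustrative, and it assumes, as the comparison itself does, that central-office needs would scale linearly with enrollment.

```python
# Figures as quoted in the text.
archdiocese_students = 120_000
archdiocese_staff = 22

city_students = 1_200_000
city_headquarters_staff = 3_000
city_field_staff = 22_500   # workers reporting to headquarters, spread around the city

# Total central-office workers in the city system.
city_central_staff = city_headquarters_staff + city_field_staff

# Staff the city would have at the Archdiocese's students-per-staff ratio.
proportional_staff = archdiocese_staff * city_students / archdiocese_students

# How many times larger the actual figure is than the proportional one.
overstaffing_ratio = city_central_staff / proportional_staff

print(f"city central-office workers: {city_central_staff}")
print(f"proportional to Archdiocese: {proportional_staff:.0f}")
print(f"ratio of actual to proportional: {overstaffing_ratio:.0f}x")
```

The gap works out to more than a hundredfold, though the comparison says nothing about differences in what the two central offices are asked to do.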

In the face of such beguiling arguments, it is tempting to dismiss econometrics altogether.  But you have only to remember the story of, say, MIT’s Theodore Postol and the Patriot missiles that were fired during the 1991 Gulf War, to realize how important statistical measurement can be in the face of what otherwise seems incontrovertible evidence. Applied microeconomics affords an indispensable foundation for policy debates. But we must go on feeling the elephant with all the instruments of our extended senses at our command.