Erica Groshen, former Commissioner of the Bureau of Labor Statistics, responds to five questions on the future of data accumulation in the U.S.

On Aug. 29, former Bureau of Labor Statistics Commissioner Erica Groshen held a seminar at the Upjohn Institute’s offices in Kalamazoo, Michigan. We sat down with Groshen the following day to get more detail on some of the concepts she mentioned during the seminar. The following transcript has been edited for length; a full transcript is available at



Q: Do you see any role for private third-party data supplementing or supplanting official government labor statistics? If so, what would it be?

EG: I would say a symbiotic relationship is where we are heading. Administrative data is very useful but it doesn't have some of the features of official statistics. To make the transition to a statistic you can use, you almost always have to use official data, which is based on a representative sample that has known properties. So there will be an increasing need for official statistics at the same time you have this explosion of other sources.

The other way the statistical system benefits from having all this third-party data is there's a lot of innovation the statistical agencies can learn from. Techniques such as autocoding, small domain estimation and highly computationally intensive work are things the statistical agencies have been able to adapt for use within the system.

One of the things you have to do with any statistic is figure out whether or not it makes sense in light of other things you think you know. So, that validation work is important.

Q: Response rates to surveys have been dropping, especially for household surveys. How did BLS address this decline?

EG: It’s always a three-legged pitch: it’s really important, it's safe, and we make it as easy as possible. The key thing to maintaining it these days is convincing people it’s safe and counteracting any aspersions that it's not worth it because the data’s garbage or something.

To lower the burden, BLS has been offering many response modes for lots of its surveys: if you want to do it on paper, if you want to fax, if you want to call it in. Another approach is to key off respondents' own internal systems as much as possible, which means working with intermediary companies such as payroll service providers and software providers.

Machine coding is a way of reducing burden on respondents as well. Instead of asking companies to give us their standard occupational codes, you can say to companies "just give us your occupational title," and BLS would use machine coding to convert that to SOC codes.
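As a rough illustration of what machine coding means in practice, the sketch below maps a free-text occupational title to a SOC code by fuzzy string matching. It is not BLS's actual autocoder, which is not described in the interview. The lookup table, the function name `code_title` and the 0.5 similarity cutoff are all assumptions made for this example, and the three codes shown should be checked against the official SOC manual before any real use.

```python
import difflib

# Tiny illustrative lookup table (an assumption for this sketch);
# a real coder would draw on the full SOC structure and far richer text models.
SOC_TITLES = {
    "registered nurses": "29-1141",
    "cashiers": "41-2011",
    "accountants and auditors": "13-2011",
}


def code_title(job_title, cutoff=0.5):
    """Return (soc_code, matched_title), or None if no title is close enough."""
    normalized = job_title.strip().lower()
    # get_close_matches ranks candidates by similarity ratio, best first.
    matches = difflib.get_close_matches(
        normalized, list(SOC_TITLES), n=1, cutoff=cutoff
    )
    if not matches:
        return None  # leave unmatched titles for a human coder
    return SOC_TITLES[matches[0]], matches[0]
```

With this table, a respondent who writes "Registered Nurse" is matched to "registered nurses", while a title that resembles nothing in the table returns `None` and would be routed to manual review.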

Agencies need to make sure they communicate well about why the respondent can trust the agencies to do the right thing with the data, to protect it and to turn it into info that’s important to the respondents. To the extent you can use administrative data you should, but there is just some information you’re not going to get if you don't ask people.

Q: There is concern that changing work relationships—for example, independent contractors, gig workers, app-based workers—are evolving faster than BLS can develop the tools to measure. How has BLS documented these forms of work?

EG: BLS has run a supplement to the Current Population Survey, called the Survey of Contingent Work and Alternative Work Arrangements. And there have been two questions added: was this contingent work or alternative work? Was it mediated through an electronic platform, with matching, or was the work done electronically? But that (survey funding) is a one-shot deal, and BLS is going to need to have the resources to follow it.

The gaping hole is in understanding how employers see these alternative work arrangements and how they've changed their attitude toward how they obtain labor. One step toward doing that is changing the Quarterly Census of Employment and Wages Annual Refiling Survey to make it a random sample each quarter and then to add some questions about the employment practices of establishments.

We have a puzzle with the Current Population Survey. There is a set of questions: Do you work part time? Do you hold multiple jobs? Are you self-employed? If we’re having this burgeoning of alternative and contingent work arrangements, you would think you would see any one of these growing, and you don't. Some of that work has been done right here at the Upjohn Institute by Susan Houseman. There is some evidence that people aren't thinking about work for which they receive 1099 forms as work. Cognitive research is needed to figure out how people are thinking about it so you can structure the questions properly.

Q: Federal statistical agencies have experienced multiple budget freezes and cuts recently, even as the demand for better data has grown. How are cuts decided?

EG: The basic thought process on eliminating a program at BLS is protection of the Principal Federal Economic Indicators designated by the Office of Management and Budget. BLS also protects programs that are written into law.

So, the places you look to cut are none of those. Unfortunately, those tend to be programs on which there’s a lot of research done: National Longitudinal Survey, the Job Openings and Labor Turnover Survey, the American Time Use Survey, and the Employer Benefits Survey.

BLS has proposed at various times eliminating the National Longitudinal Survey, not because it’s not hugely valuable, but because, of this set, it's a fairly large program. But defenders of the NLS have been quite vocal and have helped to protect it, which is great, because it is a really valuable program. But it does put BLS in a bind. The (programs) that are most vulnerable are those four, except that their supporters are quite energized to protect them because they know that they’re close to the chopping block.

Then there's trying to shrink programs. There are two general ways to shrink a program. One is to cut the sample to make it smaller, so then you lose detail and your standard errors rise. The fixed costs of running the program are still there. Another thing that’s been done is to cut the periodicity. The National Longitudinal Survey used to be annual, now it’s every two years, and there have been proposals to make it every three years. But both of these come at the cost of degrading the data.
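The link between sample size and standard errors can be made concrete with a short calculation. The sketch below assumes simple random sampling, where the standard error of a mean is the standard deviation divided by the square root of the sample size. Actual BLS surveys use more complex designs, so the numbers are only indicative, and the function names are hypothetical.

```python
import math


def standard_error(std_dev, n):
    """Standard error of a sample mean under simple random sampling."""
    return std_dev / math.sqrt(n)


def se_inflation(old_n, new_n):
    """Factor by which the standard error grows when the sample
    shrinks from old_n to new_n (same population variability)."""
    return math.sqrt(old_n / new_n)


# Halving a sample of 60,000 multiplies the standard error by sqrt(2).
factor = se_inflation(60_000, 30_000)
print(f"Standard error grows by a factor of {factor:.2f}")
```

Under these assumptions, cutting the sample in half saves well under half the budget, because fixed costs remain, yet raises standard errors by roughly 41 percent.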

Q: If these programs exist to help us make better decisions, what does cutting them mean?

EG: The elected officials are not running on platforms saying "I'm going to make sure you have access to the statistics you need to make good decisions." It’s a classic public goods problem. Everybody uses it, it would be undersupplied by the private market, but few people are ensuring that it continues to exist.