The U.S. census and the data it produces

Every ten years, the United States conducts a census: it counts the population living in the United States and records information about each person. The Constitution requires this count, and it has been carried out every decade since 1790. The enumeration serves as the basis for, among other things, apportioning seats in the House of Representatives among the states. The next census will be in 2020.

The census is conducted by household, and records every person living within the household and their relationship to the head of household (spouse, child, lodger, etc.). The census has asked different questions in different years. In 1920, the standard fields included, among other things, mother tongue, race, place of birth, whether the person was a U.S. citizen, whether the person could read or write, and what their occupation was.

The information recorded can be used for all kinds of analyses, and is a great source of information about ordinary people, especially in the aggregate. What occupations were common in a particular place? Were immigrants from a given country especially likely to hold a certain occupation? How many households had lodgers? Did urban or rural families have a larger number of children? Who was literate, and who was not? And so on.

Back in the day, the census was done entirely in longhand, so the 1920 census schedules look like this:

[Image: 1920 census sheet with longhand writing]

Handwritten sheets like this would of course be much too unwieldy for any kind of data analysis. Happily, the data is also available in CSV (comma-separated values) format, which can be opened and analyzed in, for example, a spreadsheet program like Microsoft Excel.
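A spreadsheet is not the only option: a few lines of code can answer the same kinds of questions from a CSV file. The sketch below is only an illustration, written in Python with its standard csv module. The sample rows are made up, and the column names (household_id, relationship, birthplace, occupation, literate) are hypothetical stand-ins; an actual IPUMS extract uses its own variable names, which are described in the IPUMS documentation.

```python
import csv
import io
from collections import Counter

# Made-up sample rows for illustration only; these are NOT real census records.
# The column names are hypothetical; a real extract will use different ones.
SAMPLE_CSV = """household_id,relationship,birthplace,occupation,literate
1,head,Colorado,farmer,yes
1,spouse,Colorado,none,yes
1,lodger,Germany,laborer,no
2,head,Germany,farmer,yes
2,child,Colorado,none,yes
"""


def count_by_column(csv_file, column):
    """Count how often each value appears in one column of a CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


def households_with_lodgers(csv_file):
    """Return the set of household ids that include at least one lodger."""
    reader = csv.DictReader(csv_file)
    return {
        row["household_id"]
        for row in reader
        if row["relationship"] == "lodger"
    }


# io.StringIO lets the sample text stand in for an open file.
occupation_counts = count_by_column(io.StringIO(SAMPLE_CSV), "occupation")
lodger_households = households_with_lodgers(io.StringIO(SAMPLE_CSV))

print(occupation_counts.most_common())
print(len(lodger_households), "household(s) with lodgers")
```

To run the same functions on a downloaded file, the io.StringIO(...) argument could be replaced with a file opened via open("extract.csv", newline=""), where extract.csv is a placeholder name for whatever file you saved.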

Census data is easy to get: IPUMS (Integrated Public Use Microdata Series) provides full count data or samples from 15 U.S. censuses.[1] On the IPUMS site, you can create a custom query as well as find the documentation for the data they provide.

In these pages, you’ll find some analyses we have created from this data for Colorado counties (currently Weld County).

[1] Steven Ruggles, Sarah Flood, Ronald Goeken, Josiah Grover, Erin Meyer, Jose Pacas, and Matthew Sobek. IPUMS USA: Version 8.0 [dataset]. Minneapolis, MN: IPUMS, 2018. https://doi.org/10.18128/D010.V8.0