What we did with the census data

[Contributor: Jesse White]

The starting point for all our analysis is the raw census data, which we downloaded from IPUMS (see the page on the U.S. census). The original 1920 census data for Weld County, Colorado, contains roughly 52,000 entries, with numerous columns of information for each household.

We opened the census data in Microsoft Excel, which made counting, adding, and separating data easy, so we could identify and visualize differences and trends within the data. Just scrolling through certain columns can spark hypotheses. Some odd patterns may stick out and invite further investigation into the reasons behind them. Individual columns may also appear to correlate with others.

Screen shot showing census data columns in Excel

Since we are interested in immigration, two columns that jump out as intriguing are the country of birth (bpl) and the year that the individual immigrated (yrimmig). For example, we can easily make a pivot chart in Excel that shows how many immigrants in the entire data set arrived in each year. This is a useful visualization, which shows the years when the bulk of the immigrants in Weld County came to the United States.

Pivot chart of year of immigration showing a bar graph

A pivot chart simply counts the frequency of each of the values of a particular variable (in this case, year of immigration) and plots it in a graph. (Find out more about pivot charts and how to make them!)
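The same frequency count can be sketched outside Excel. The snippet below is only an illustration: the rows are made up, the column names bpl and yrimmig are taken from the text, and treating a year of 0 as "no immigration year recorded" is an assumption about how the file is coded, not something confirmed here.

```python
from collections import Counter

# Made-up sample rows; the real file has roughly 52,000 of them.
# Column names follow the text: bpl (country of birth), yrimmig (year of immigration).
records = [
    {"bpl": "Russia", "yrimmig": 1913},
    {"bpl": "Russia", "yrimmig": 1913},
    {"bpl": "Germany", "yrimmig": 1885},
    {"bpl": "Japan", "yrimmig": 1907},
    {"bpl": "Colorado", "yrimmig": 0},  # assumption: 0 marks "not applicable" (U.S.-born)
]

def count_by_year(rows):
    """Count how many rows fall in each immigration year, skipping year 0."""
    return Counter(row["yrimmig"] for row in rows if row["yrimmig"] > 0)

year_counts = count_by_year(records)
for year in sorted(year_counts):
    # A crude text bar chart: one '#' per person.
    print(year, "#" * year_counts[year])
```

Each printed line plays the role of one bar in the pivot chart: the value of the variable, followed by how often it occurs.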

This chart is useful as a broad overview, but the fact that the full data set shows a trend suggests that narrower subsets may show trends of their own. That prompts us to break the data down by country, to see when people from the same place left their home countries.

It’s worth emphasizing that in looking at charts like this, you have to pay attention to what the chart shows. For example, remember that this is not a chart of all immigration into the U.S. in these years. Nor is it a chart of immigrants in Weld County in those years. Instead, this chart answers the question, “When had the immigrants who lived in Weld County in 1920 first entered the United States?”

By sorting the data by country, we can dig into country-specific trends.

For example, here’s a screen shot of a graph showing when those in Weld County who were born in Russia (which became the Soviet Union in 1922, after the 1917 revolution) arrived in the United States.

Russia immigration line graph

Since it’s rather hard to compare the two charts separately, we can also plot the year of immigration of those born in Russia against the year of immigration of all the foreign-born population in Weld County:

Year of immigration – All foreign-born population in Weld County vs. population born in Russia/USSR

As we can see in this chart, the population born in Russia generally immigrated later than the overall foreign-born population. We can also see that in the 1910s they account for a very large percentage of arrivals in those years. Again, remember that this is not a chart of immigrants in Weld County in those years, nor of all U.S. immigration, nor of arrivals in Weld County in those years. Still, it’s interesting to note that of those in Weld County, a very large percentage of recent arrivals were born in Russia. Indeed, immigrants from Russia who arrived in a single year (1913) accounted for over 1% of the entire Weld County population in 1920.
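The comparison behind a chart like this comes down to two counts per year and a division. Here is a minimal sketch with made-up rows; the helper name arrivals_by_year is hypothetical, and treating a year of 0 as "no immigration year" is an assumption about the coding. The percentages it prints describe the toy rows only, not Weld County.

```python
from collections import Counter

# Made-up sample rows, not the real census extract.
records = [
    {"bpl": "Russia", "yrimmig": 1913},
    {"bpl": "Russia", "yrimmig": 1913},
    {"bpl": "Russia", "yrimmig": 1912},
    {"bpl": "Germany", "yrimmig": 1913},
    {"bpl": "Germany", "yrimmig": 1885},
    {"bpl": "Colorado", "yrimmig": 0},  # assumption: 0 means no immigration year
]

def arrivals_by_year(rows, birthplace=None):
    """Count immigrants per arrival year, optionally for one birthplace only."""
    return Counter(
        row["yrimmig"]
        for row in rows
        if row["yrimmig"] > 0 and (birthplace is None or row["bpl"] == birthplace)
    )

all_foreign_born = arrivals_by_year(records)
russia_born = arrivals_by_year(records, birthplace="Russia")

for year in sorted(all_foreign_born):
    # Share among that year's arrivals vs. share of everyone in the file.
    share_of_arrivals = russia_born[year] / all_foreign_born[year]
    share_of_county = russia_born[year] / len(records)
    print(f"{year}: {share_of_arrivals:.0%} of that year's arrivals, "
          f"{share_of_county:.1%} of everyone counted")
```

The two shares answer different questions, which is why the denominators differ: one divides by that year's arrivals, the other by every person in the file, immigrant or not.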

We might want to ask whether other immigrant populations show trends in their arrival years, too, and how those trends stack up against immigrants from Russia. Here’s a graph comparing those born in Germany, Japan, and Russia (note that since there are so many more immigrants from Russia, they are plotted on a secondary axis, the right-hand Y-axis; in other words, the scale is different, but we can still see the over-time trends in all three populations and compare them against each other):

Line graph, showing immigration from Germany peaking first, then immigration from Japan, then immigration from Russia

In this chart we can see that immigration from Germany peaks earliest and is significant from 1880 onward, while immigration from Japan has a noticeable peak around 1908 and again around 1912. The earlier year, especially, matches a peak in Japanese immigration to the U.S. overall, as the knowledge of the imminent so-called Gentlemen’s Agreement between the U.S. and Japan curbing Japanese immigration spurred a wave of migration in 1907 and 1908.[1]
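Reading peaks off a chart can be cross-checked by grouping the rows by birthplace and finding the busiest arrival year in each group. The sketch below uses made-up rows and a hypothetical helper name, and it reports only the single busiest year per group, so a second peak like the one described for Japan would still need the chart or the full counts.

```python
from collections import Counter, defaultdict

# Made-up sample rows for illustration only.
records = [
    {"bpl": "Germany", "yrimmig": 1885},
    {"bpl": "Germany", "yrimmig": 1885},
    {"bpl": "Germany", "yrimmig": 1902},
    {"bpl": "Japan", "yrimmig": 1907},
    {"bpl": "Japan", "yrimmig": 1907},
    {"bpl": "Japan", "yrimmig": 1912},
    {"bpl": "Russia", "yrimmig": 1913},
    {"bpl": "Russia", "yrimmig": 1913},
    {"bpl": "Russia", "yrimmig": 1906},
]

def peak_year_by_birthplace(rows):
    """Return {birthplace: (busiest arrival year, number of people)}."""
    per_place = defaultdict(Counter)
    for row in rows:
        per_place[row["bpl"]][row["yrimmig"]] += 1
    return {place: counts.most_common(1)[0] for place, counts in per_place.items()}

peaks = peak_year_by_birthplace(records)
# List the groups in order of their peak year, earliest first.
for place, (year, n) in sorted(peaks.items(), key=lambda item: item[1][0]):
    print(f"{place}: peak in {year} ({n} people)")
```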

Besides comparing trends, we might also dig deeper into the main populations. What can we know from the data about the Russian immigrants, for instance?

Poking around in the data, one quickly notices that many of the immigrants from Russia list German rather than Russian as their mother tongue. How common was this?

Another pivot table quickly answers that question:

pivot table showing mother tongues of immigrants born in Russia: German is 3,217, Russian 387, English 86, and there is a smattering of "Jewish", Hebrew, and Yiddish, as well as some others.

We could of course draw a nice graph of this, but even the table clearly shows us that German-speakers far outnumber any other group among the immigrants from Russia, accounting for over 85 percent of all the people in Weld County who were born in Russia. Russian-speakers account for only just over 10 percent of those born in Russia in Weld County. Digging into this even a little bit, we quickly find that there was a large German migration to Russia during the reign of Catherine the Great, and that many of the descendants of these so-called Volga Germans left Russia for the United States in the early twentieth century.
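The percentages can be checked with a few lines of arithmetic. This sketch uses only the three counts quoted from the pivot table; the smaller groups are omitted because their counts are not given here, so the shares it prints are relative to these three groups alone and run slightly higher than shares of the full Russia-born total.

```python
# Counts read off the pivot table above. The smaller groups ("Jewish", Hebrew,
# Yiddish, and others) are left out because their counts are not given here,
# so these shares are slightly higher than shares of the full Russia-born total.
mother_tongue_counts = {"German": 3217, "Russian": 387, "English": 86}

def percent_shares(counts):
    """Convert raw counts into percentages of their combined total."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}

shares = percent_shares(mother_tongue_counts)
# Print the largest group first.
for label, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{label}: {pct:.1f}%")
```

Even with the omitted groups added back into the total, the ordering of the three groups would not change.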

As can be seen from the above basic analyses, it is fairly easy to look for patterns in the data. The data cannot really answer any “why” questions, so research into other sources is needed – but it can point to interesting questions to find out more about.

[1] Wei, William. Asians in Colorado: A History of Persecution and Perseverance in the Centennial State. Seattle: University of Washington Press, 2016.