Background and the dataset

Between 1885 and 1923, the Canadian government imposed a head tax on Chinese immigrants entering Canada in order to restrict immigration. A print register was created to keep track of the influx of migrants, and its detailed entries accumulated years of demographic information that has become a rich source of data for researchers. For more details about the historical background and the dataset, please refer to Hacking the Historical Data: Register of Chinese Immigrants to Canada, 1886-1949, an OSF project.

In the spreadsheet, each row corresponds to one immigrant. The columns (variables) include demographic information such as name, age, height, gender, and profession, as well as destination and place of origin. Most of the immigrants came from Guangdong, a province in Southern China, and the origin is broken down into two geographic levels: county and village.

There was a critical issue with the data, though: entries were captured in the immigrants' idiosyncratic dialects, which produced varying English renderings of place names and titles. Thanks to a project spearheaded by UBC Asian Library, these inconsistencies were partly corrected: all of the county names and a large portion of the village names were normalized.

Since the entire dataset contains over 90,000 records, I filtered the data to focus on one county, Zhongshan, so that the visualization would not become overcrowded or illegible. You can download the filtered Zhongshan dataset for the following steps.
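
If you would rather reproduce the filtering step yourself, here is a minimal sketch in Python using only the standard library. It is not necessarily how the filtered file above was produced. The column names (name, county, village) and the three sample rows are placeholders invented for illustration; check the headers in the actual spreadsheet and adjust them accordingly.

```python
import csv
import io


def filter_by_county(rows, county, column="county"):
    """Return only the rows whose `column` value matches `county`.

    The comparison ignores case and surrounding whitespace.
    The default column name "county" is an assumption, not the
    confirmed header of the real register.
    """
    target = county.strip().lower()
    return [
        row for row in rows
        if (row.get(column) or "").strip().lower() == target
    ]


# Made-up sample standing in for the real register (placeholder values only).
sample_csv = io.StringIO(
    "name,county,village\n"
    "Person A,Zhongshan,Village X\n"
    "Person B,Taishan,Village Y\n"
    "Person C,Zhongshan,Village Z\n"
)

rows = list(csv.DictReader(sample_csv))
zhongshan = filter_by_county(rows, "Zhongshan")
print(len(zhongshan), "of", len(rows), "rows kept")
```

To run this against the full register, replace the in-memory sample with open("your_file.csv", newline="", encoding="utf-8"), where the file name is whatever you saved the spreadsheet export as.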