
Background and the dataset

Between 1885 and 1923, the Canadian government imposed a head tax on Chinese immigrants entering Canada in order to restrict immigration. A print register was created to keep track of the arrivals, and its detailed entries accumulated into years of demographic information that has become a rich source of data for researchers. For more details about the historical background and the dataset, please refer to Hacking the Historical Data: Register of Chinese Immigrants to Canada, 1886-1949, an OSF project.

In the spreadsheet, each row corresponds to one immigrant. The variables (columns) include demographic information such as name, age, height, gender, and profession; the destination; and the place of origin (mostly in Guangdong, a province in southern China), which is broken down into two geographic levels: county and village.

There was a critical issue with the data, though: place names and titles were recorded as they sounded in the immigrants' idiosyncratic dialects, which resulted in many variant English spellings. Thanks to a project spearheaded by UBC Asian Library, these inconsistencies have been partly corrected: all of the county names and a large portion of the village names have been normalized.

Since the whole dataset has over 90,000 records, I filtered the data to one county, Zhongshan, to keep the visualization from becoming overcrowded or illegible. You can download the filtered dataset of Zhongshan for the following steps.
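If you would rather reproduce this kind of filter on your own copy of the data, the following is a minimal sketch in Python using only the standard library. It is not necessarily how the filtered file was produced. The column names (`id`, `county`, `village`), the helper `filter_by_county`, and the sample rows are all hypothetical placeholders; the real spreadsheet's headers may differ.

```python
import csv
import io

# Hypothetical column name; check the actual header in the spreadsheet.
COUNTY_COLUMN = "county"


def filter_by_county(rows, county, column=COUNTY_COLUMN):
    """Keep only the rows whose county value matches `county` (case-insensitive)."""
    target = county.strip().lower()
    return [
        row for row in rows
        if (row.get(column) or "").strip().lower() == target
    ]


# Made-up placeholder rows standing in for the register; not real records.
sample_csv = """id,county,village
1,Zhongshan,Village A
2,Placeholder County,Village B
3,Zhongshan,Village C
"""

# With a real file, replace io.StringIO(...) with open(path, newline="", encoding="utf-8").
rows = list(csv.DictReader(io.StringIO(sample_csv)))
zhongshan_rows = filter_by_county(rows, "Zhongshan")
print(len(zhongshan_rows), "rows kept out of", len(rows))
```

Matching on the normalized county name is only reliable because the county column was fully cleaned up; the same approach on village names would miss the spellings that have not been normalized yet.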