Workshop

View the Project on GitHub MaxwellFonss/Workshop

Literature Review

During the People’s Republic of China’s transition to a first world country over the last few decades, human development has improved greatly as citizens have gained access to essential resources such as better wages and education. This remarkable improvement in the average Chinese citizen’s quality of life is illustrated by per capita GDP, which is projected to double from six thousand dollars to twelve thousand dollars a year in the ten-year period from 2012 to 2022. However, while the urbanized coastal cities closest to the global economy continue to grow, the stagnation of rural regions within inland provinces has resulted in a concerning amount of inequality. While China’s continued growth may still improve conventionally accepted indicators of development such as investment, those indicators fail to capture the continued struggles faced by hundreds of millions of people. For instance, China still experiences an estimated 1.2 million premature deaths each year countrywide despite the improving healthcare services in megacities such as Shanghai or Beijing. Additionally, while China attracts a substantial amount of foreign investment to create new industries, more than a third of its citizens still make their living from traditional agriculture. Much of this regional inequality can be attributed to the inability to provide citizens living in rural communities with access to adequate transportation, healthcare, and financial services. While many previously developed countries have been able to redistribute opportunities to citizens living outside urban areas through either a federalized system of government or direct government aid, China’s centralized system of government, combined with the vastness of its territory and population, has meant that previously proven methods of supporting rural communities have not been effective. Instead, the country faces the daunting task of urbanizing the largest population in human history.

Although the goal of urbanization may initially seem straightforward, the Chinese central government must contend with many agents, existing social services, and complex interconnected networks when encouraging it. Most of China’s inland provinces are made up of scattered small communities governed by village leaders who associate the existing rural lifestyle with a unique regional culture that is distinct from the largely Han-dominated cities. Additionally, many farmers are resistant to diversifying because they lack the money to take on the risks associated with investing in unfamiliar industries. This is compounded by the stance of China’s financial sector, which sees it as a potential liability to give loans to farmers who do not have a credit history. The Chinese government must also contend with the complex network of migration when making decisions: any given decision could cause millions of citizens to migrate to other provinces, and each citizen will make a unique decision based on their wealth and education level. An example of this problem can be seen in a previous government program that offered farmers subsidies for buying new land or investing in more profitable cash crops. Much of the land that was intended to benefit farmers was instead bought by citizens from eastern provinces, who used their wealth and education to expand further into the communities the program was intended to help. Directly investing inland may also slow the growth of already large eastern cities, where the migration of rural residents into more efficient healthcare systems reduced total healthcare costs by an estimated 31 billion dollars last year. Evidently, migration is an emerging social and economic system that must be considered before the government makes any decision to encourage urbanization.

Given the scale and economic importance of China, numerous studies have analyzed aspects of urbanization such as market speculation, access to credit, land prices, and especially migration. Each study used a specific methodology to analyze its dependent variable. Examples include autoregressive integrated moving average models, Granger causality tests, exponential random graph models, random walk models, rank size models, and Bayesian networks.

Autoregressive integrated moving average (ARIMA) models are designed to forecast the future values of a specific variable. These models assume that the variable being analyzed is primarily influenced by the trend of its own prior values over time. Therefore, researchers can use historical time series data for the variable to predict future change. Since this method can model the growth of important economic statistics well but requires large amounts of data, it is used extensively in econometric studies.
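To make the idea of forecasting a variable from its own past concrete, here is a minimal sketch in Python. It is not a full ARIMA model: it fits only the simplest autoregressive case, in which each value depends linearly on the one before it. The function names and the `history` values are hypothetical and chosen purely for illustration.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    return mean_curr - phi * mean_prev, phi


def forecast_ar1(series, steps):
    """Roll the fitted relationship forward from the last observed value."""
    c, phi = fit_ar1(series)
    predictions, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Hypothetical index values, constructed so that x[t] = 10 + 0.5 * x[t-1].
history = [4.0, 12.0, 16.0, 18.0, 19.0, 19.5]
next_two = forecast_ar1(history, steps=2)
print(next_two)  # approximately [19.75, 19.875]
```

A real study would also difference the series and include moving average terms, which is what the "integrated" and "moving average" parts of the name refer to.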

Granger causality tests are a variation of traditional hypothesis tests that use time series data to determine whether past values of one variable help predict another variable, which is taken as evidence that the first likely caused changes in the second. Although Granger causality tests are liked by statisticians for their simplicity, researchers must be careful not to infer causality from datasets that show a relationship out of pure coincidence. Due to this method’s versatility and potential for ‘false positive’ conclusions of causality, scholars have debated whether it is bad practice to use Granger causality tests outside of STEM related fields of study.
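The logic of the test can be sketched in Python as a comparison of two forecasts: one that uses only a variable’s own past, and one that also uses the past of the suspected cause. The sketch below assumes a single lag and computes only the F statistic; a real test would compare that statistic against an F distribution to obtain a p-value. The function names and both data series are hypothetical.

```python
def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (intercept added here)."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           + [sum(d[i] * t for d, t in zip(design, targets))]
           for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, d))) ** 2
               for d, t in zip(design, targets))


def granger_f(cause, effect):
    """F statistic for 'lagged cause improves a one-lag forecast of effect'."""
    y = effect[1:]
    own_past = [[effect[t - 1]] for t in range(1, len(effect))]
    both_pasts = [[effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_restricted = ols_rss(own_past, y)
    rss_unrestricted = ols_rss(both_pasts, y)
    dof = len(y) - 3  # observations minus parameters in the larger model
    return (rss_restricted - rss_unrestricted) / (rss_unrestricted / dof)


# Hypothetical data: the response follows the driver with a one-period delay.
driver = [1, 3, 2, 5, 4, 6, 5, 8, 7, 9]
noise = [0.0, 0.1, -0.2, 0.1, 0.2, -0.1, -0.2, 0.1, 0.2, -0.1]
response = [0.0] + [driver[t - 1] + noise[t] for t in range(1, len(driver))]

f_stat = granger_f(driver, response)
print(f_stat)  # a large value means the driver's past improves the forecast
```

A large statistic only shows that one series helps predict the other, which is why the warning about coincidental relationships still applies.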

Exponential random graph models are a family of methods used to analyze a network. They take the form of probability distributions over possible networks, in which the likelihood of a given network depends on structural statistics of its nodes and connections. Due to the complex nature of a network, exponential random graph models are especially useful since they can help a researcher study ‘almost anything’ given enough data. For instance, a researcher can examine the number of connections shared by any group of nodes or the relative popularity of any node in the network. The broad scope of exponential random graph models makes them popular in sociology studies.
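As a rough illustration, the Python sketch below builds the simplest possible exponential random graph model: three nodes and a single statistic, the number of connections. Every possible network is given a weight of exp(theta × number of connections), and the weights are normalized into probabilities. The node names and the value of `theta` are hypothetical, and real studies use far larger networks and richer statistics that cannot be enumerated this way.

```python
import itertools
import math

NODES = ["A", "B", "C"]  # hypothetical cities
PAIRS = list(itertools.combinations(NODES, 2))  # every possible connection


def all_graphs():
    """Yield every undirected network on NODES as a frozenset of connections."""
    for size in range(len(PAIRS) + 1):
        for edges in itertools.combinations(PAIRS, size):
            yield frozenset(edges)


def ergm_distribution(theta):
    """P(G) proportional to exp(theta * number of connections in G)."""
    weights = {g: math.exp(theta * len(g)) for g in all_graphs()}
    normalizer = sum(weights.values())
    return {g: w / normalizer for g, w in weights.items()}


# A positive theta makes denser networks more likely.
distribution = ergm_distribution(theta=math.log(2))
print(distribution[frozenset()])       # empty network: about 1/27
print(distribution[frozenset(PAIRS)])  # fully connected network: about 8/27
```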

Random walk models are simulations in which an object takes a set number of random steps. Since almost all aspects of life are at least somewhat random, the method can be used to model a wide range of processes across any number of dimensions, be that a one-dimensional number line or a three-dimensional landscape. Given that random walk models are an integral part of probability theory, they are used in fields of study ranging from game theory to molecular biology.
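A one-dimensional random walk can be sketched in a few lines of Python. This is a minimal illustration in which each step moves one unit left or right with equal probability; the function name and the seed value are hypothetical.

```python
import random


def random_walk(steps, seed=None):
    """Return the positions visited by a walk that starts at 0."""
    rng = random.Random(seed)  # a seed makes the walk reproducible
    position, path = 0, [0]
    for _ in range(steps):
        position += rng.choice((-1, 1))
        path.append(position)
    return path


path = random_walk(steps=100, seed=150)
print(path[-1])  # final position after 100 random steps
```

Extending the walk to two or three dimensions only requires choosing a random direction along each axis at every step.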

Rank size models are visual representations of data that plot datapoints according to their relative ranking. Generally, this means graphing the magnitude of the statistic on the y axis and the rank of the datapoint on the x axis. This method is especially useful for analyzing datasets that contain a few very large samples and many smaller ones. Examples include city populations, personal wealth, and the gross domestic product of countries.
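The ranking step can be sketched in Python as follows. The population figures are hypothetical and were chosen so that each city’s size is exactly the largest city’s size divided by its rank, which gives a slope of -1 when both axes are on a logarithmic scale. Real city data would only approximate such a line.

```python
import math


def rank_size(values):
    """Pair each value with its rank, largest first (rank 1)."""
    ordered = sorted(values, reverse=True)
    return list(enumerate(ordered, start=1))


def log_log_slope(pairs):
    """Least-squares slope of log(size) against log(rank)."""
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(size) for _, size in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical city populations in thousands, in no particular order.
populations = [300, 1200, 240, 600, 400]
pairs = rank_size(populations)
print(pairs)                 # [(1, 1200), (2, 600), (3, 400), (4, 300), (5, 240)]
print(log_log_slope(pairs))  # approximately -1.0
```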

Bayesian networks are directed graphs without cycles in which each node is a variable and each link represents a conditional, often causal, dependency between two nodes. For instance, a Bayesian network about public health could show that obesity and genetics may each contribute to diabetes. Because the links are directed, the network would not show the reverse relationship, such as diabetes influencing a person’s genetics. Bayesian networks are similar in utility to Granger causality tests, but they can be applied on a much wider scale in terms of both the number of variables and the size of the dataset. In today’s world of big data, machine learning algorithms frequently use Bayesian networks to analyze entire populations. Due to these developments, methods such as Bayesian networks have laid the foundations for interdisciplinary fields between computer science and the humanities.
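The public health example can be sketched in Python as a three-node network in which genetics and obesity both point to diabetes. Every probability below is a made-up number used only to show the mechanics, not an epidemiological estimate, and the variable names are hypothetical.

```python
from itertools import product

# Hypothetical probabilities for the two parent nodes.
P_GENETICS = 0.1
P_OBESITY = 0.3
# Hypothetical P(diabetes | genetics, obesity) for each parent combination.
P_DIABETES = {
    (True, True): 0.5,
    (True, False): 0.2,
    (False, True): 0.1,
    (False, False): 0.02,
}


def joint(genetics, obesity, diabetes):
    """Joint probability, factored along the links of the network."""
    p = P_GENETICS if genetics else 1 - P_GENETICS
    p *= P_OBESITY if obesity else 1 - P_OBESITY
    p_d = P_DIABETES[(genetics, obesity)]
    return p * (p_d if diabetes else 1 - p_d)


# Overall chance of diabetes: sum over every combination of the parents.
p_diabetes = sum(joint(g, o, True) for g, o in product([True, False], repeat=2))

# Reasoning backwards: chance of obesity given that diabetes is observed.
p_obesity_given_diabetes = (
    sum(joint(g, True, True) for g in (True, False)) / p_diabetes
)

print(p_diabetes)                # approximately 0.0686
print(p_obesity_given_diabetes)  # approximately 0.612
```

Observing diabetes raises the probability of obesity from 0.3 to roughly 0.61 under these made-up numbers, which is the kind of inference that makes these networks useful on large datasets.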

In contrast to studies of other developing nations, studies analyzing China have access to significant amounts of data due to the country’s high level of technological development. This gives researchers the luxury of taking a ‘top down’ approach, which allows studies to answer more targeted questions than ‘bottom up’ ones. For instance, while many studies of sub-Saharan African nations attempt to monitor the general growth of economic activity through electronic records of consumer loans, researchers in China can predict real estate prices in specific provinces by analyzing hedging activities in the financial market. Some of these more detailed datasets include cellphone-based location data, government records of interprovincial migrant populations, census data, and survey results.

As in the United States, the Chinese internet services industry is dominated by a handful of large corporations. This technological ecosystem is especially useful to researchers since a dataset provided by one of these companies is likely to be comprehensive enough to approximate the entire population. One example is the internet giant Tencent, which collects location data from the users of its services and allows it to be accessed for research purposes. In Spatial patterns and determinant factors of population flow networks in China: Analysis on Tencent Location Big Data, this data was processed through an exponential random graph model to identify which factors made cities attractive to move to.

Comparable to the American census, the Hukou is a household registration system that has been used to manage China’s large population. While the system has faced western criticism for its role in further restricting the social mobility of China’s rural population, it is nonetheless a useful source of information for sociological studies. In Geographical transformations of urban sprawl: Exploring the spatial heterogeneity across cities in China 1992–2015, Hukou data was used in a regression model to determine how a lack of urban planning affected the population distribution of rural areas.

While population rankings of Chinese cities may seem simplistic, they are necessary for many studies in order to construct rank size models. All the papers that analyzed migration relied on population rankings, presented through rank size models, to convey the general magnitude of population centers to the reader.

Although the predictive models created by studies that used quantitative methods are incredibly important for understanding the macro environment of Chinese urbanization, Diversification and Agrarian Change under Environmental Constraints in Rural China: Evidence from a Poor Township of Beijing Municipality took a more personalized approach to understanding the individual citizen through survey data. The paper’s survey questions focused on understanding farmers’ demand for credit and their ability to obtain it. This was done by asking over fifty questions, some of which asked farmers whether they would take loans at varying interest rates and under different payment plans. According to the Lucas critique, integrating microeconomic data from individuals is essential to creating accurate macroeconomic models. This is because aggregated macroeconomic data cannot account for how individuals will react to changes in their environment. The Lucas critique is especially important in a country whose market conditions are as regulated as China’s, since the incentives of individuals are distorted to the point where economists can no longer make the free market assumptions needed to create their traditional models. Despite these obstacles, the previously mentioned paper was still able to use survey data to model a Chinese farmer’s potential social mobility through their access to and demand for credit.

Although the ten articles featured in this literature review have looked at the urbanization of China through economic, sociological, and even spatial perspectives, it may still be necessary to find articles that use the traditional raster data featured frequently throughout this course. Investigating the share of land that rural communities have access to on a more visual level could be very useful in gaining a better understanding of the challenges facing Chinese farmers, and could provide useful visual aids for a research proposal. However, the methods that were outside the scope of Data 150 were still incredibly informative and allowed me to assess the environment from a unique perspective.

Even when investigating an information-rich country such as China, there was still a clear research gap in demographically disaggregated data. All the sources featured in this literature review split the Chinese population into urban and rural groups. While this categorization is relevant to my desired area of study and easy to identify based on location, it fails to address important distinctions such as gender. Most papers analyzing urbanization focus on economic development, which is the main contributor to human development. However, these papers left me curious about how health services such as access to birth control were affected. Additionally, these papers did not consider whether the transition from rural to urban life narrowed the gap in employment between men and women. Since my area of study is complex and focused on development, I decided that my research question should be evaluative. My current draft is the following: To what degree has China’s rural population benefited from the recent growth of cities?

Bibliography

Zheng, Siqi and Kahn, Matthew, China’s Bullet Trains Facilitate Market Integration and Mitigate the Cost of Megacity Growth (March 18, 2013). https://www.pnas.org/content/pnas/110/14/E1248.full.pdf

Démurger, Sylvie and Fournier, Martin and Yang, Weiyong, Diversification and Agrarian Change under Environmental Constraints in Rural China: Evidence from a Poor Township of Beijing Municipality (April 1, 2007). http://dx.doi.org/10.2139/ssrn.988057

Turvey, Calum G. and Kong, Rong, Farmers’ Willingness to Purchase Weather Insurance in Rural China (May 6, 2010). https://ssrn.com/abstract=1601625

Aunan, Kristin and Wang, Shuxiao, Internal Migration and Urbanization in China: Impacts on Population Exposure to Household Air Pollution (2016). https://www.sciencedirect.com/science/article/pii/S0048969714002472

Gaughan, A., Stevens, F., Huang, Z. et al. Spatiotemporal patterns of population in mainland China, 1990 to 2010. Sci Data 3, 160005 (2016). https://doi.org/10.1038/sdata.2016.5

Chu, Qingqiang and Sing, Tien, Inflation Hedging Characteristics of the Chinese Real Estate Market (2004). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=538742

Deng, Yu and Qi, Wei and Fu, Bojie and Wang, Kevin, Geographical transformations of urban sprawl: Exploring the spatial heterogeneity across cities in China 1992–2015 (2015). https://www.sciencedirect.com/science/article/pii/S0264275119300307

Zhang, Weili and Chong, Zhaohui and Li, Xiaojian and Nie, Guibo, Spatial patterns and determinant factors of population flow networks in China: Analysis on Tencent Location Big Data (2019). https://www.sciencedirect.com/science/article/pii/S0264275119311862

Pan, Jinghu and Lai, Jianbo, Spatial pattern of population mobility across cities in China: Case study of the National Day plus Mid-Autumn Festival based on Tencent migration data (2019). https://www.sciencedirect.com/science/article/pii/S0264275118311703?via%3Dihub

Flowminder, Mapping Indicators of Women’s Welfare at High Spatial Resolution. https://web.flowminder.org/case-studies/mapping-indicators-of-womens-welfare-at-high-spatial-resolution