Workshop

View the Project on GitHub MaxwellFonss/Workshop

In the reading today (Stevens et al.) the authors use a technique to produce a high-resolution description of the distribution of human populations across the globe. What is the name of the technique, and, in general and basic terms, how does it work?

To map human population distribution, Stevens et al. use a technique called ‘random forest’. This is a machine learning algorithm that predicts a single dependent variable from multiple independent variables (covariates) by growing many decision trees, each on a random resample of the data, and averaging their predictions. In this study, the mapped gridcell values of every covariate are fed into the model to create a predicted distribution of the dependent variable, which in Stevens et al.’s study is population density.
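The idea can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the data are synthetic, the two covariates (`lights` and `road_dist`) and all function names are hypothetical, and the tree settings are arbitrary choices.

```python
import random
from statistics import mean


def best_split(rows, targets, feature_ids):
    """Find the (feature, threshold) pair with the lowest squared error."""
    best, best_err = None, float("inf")
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[f] <= t]
            right = [y for r, y in zip(rows, targets) if r[f] > t]
            if not left or not right:
                continue  # a split must leave data on both sides
            ml, mr = mean(left), mean(right)
            err = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
            if err < best_err:
                best, best_err = (f, t), err
    return best


def grow_tree(rows, targets, rng, depth=0, max_depth=4, min_size=5):
    """Grow one regression tree; a leaf is simply the mean target value."""
    if depth >= max_depth or len(rows) < min_size:
        return mean(targets)
    n_features = len(rows[0])
    k = max(1, n_features // 2)  # random subset of covariates tried at each split
    split = best_split(rows, targets, rng.sample(range(n_features), k))
    if split is None:
        return mean(targets)
    f, t = split
    left = [(r, y) for r, y in zip(rows, targets) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, targets) if r[f] > t]
    return (
        f,
        t,
        grow_tree([r for r, _ in left], [y for _, y in left], rng, depth + 1, max_depth, min_size),
        grow_tree([r for r, _ in right], [y for _, y in right], rng, depth + 1, max_depth, min_size),
    )


def predict_tree(node, row):
    """Walk down the tree until a leaf (a plain number) is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def grow_forest(rows, targets, n_trees=15, seed=0):
    """Grow each tree on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx], [targets[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """The forest's prediction is the average of its trees' predictions."""
    return mean([predict_tree(tree, row) for tree in forest])


# Synthetic gridcells: [lights, road_dist], both scaled to 0..1 (hypothetical covariates).
data_rng = random.Random(42)
cells = [[data_rng.random(), data_rng.random()] for _ in range(120)]
density = [100 * lights + 20 * (1 - road_dist) + data_rng.gauss(0, 2)
           for lights, road_dist in cells]

forest = grow_forest(cells, density, n_trees=15, seed=1)
bright = predict_forest(forest, [0.9, 0.1])  # bright gridcell close to a road
dark = predict_forest(forest, [0.1, 0.9])    # dark gridcell far from a road
print(f"bright cell: {bright:.1f}, dark cell: {dark:.1f}")
```

In this toy setup the forest should predict a higher density for the bright, well-connected gridcell than for the dark, remote one, because that is the pattern built into the synthetic data.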

The random forest method used by the authors is a machine learning algorithm (ensemble method). In general terms, what is a machine learning algorithm? Within the context of this study what distinguishes a data science, machine learning method (such as random forest) from previous classical statistical approaches to describing and analyzing phenomena and events?

A machine learning algorithm is a method in which a researcher uses the computing power of a machine to identify trends within a very large set of data. Using these trends, the researcher can carry out a wide range of tasks, from identifying likely causes and effects to building entire predictive models from the fitted relationships. In this particular study, Stevens et al. distinguish their work from classical statistical approaches by building semi-randomized models of population: each tree in the forest is grown on a random resample of the data, so each fitted forest is unique unless the random seed is fixed, whereas a classical model produces the same result whenever it is given the same input data.
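The contrast between a deterministic fit and a semi-randomized one can be made concrete with a small sketch. It compares a closed-form least-squares slope with a bootstrap-aggregated (bagged) average of slopes, which uses the same resampling step that random forests rely on. It is not a random forest and not the authors' code; the data are synthetic and the function names `ols_slope` and `bagged_slope` are hypothetical.

```python
import random


def ols_slope(xs, ys):
    """Closed-form least-squares slope: same data in, same answer out."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def bagged_slope(xs, ys, n_models, seed):
    """Average the slopes fitted on bootstrap resamples of the data."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in xs]  # resample with replacement
        slopes.append(ols_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    return sum(slopes) / len(slopes)


# Synthetic data: y is roughly 3 * x plus noise.
noise = random.Random(0)
xs = [i / 10 for i in range(50)]
ys = [3 * x + noise.gauss(0, 1) for x in xs]

classical = ols_slope(xs, ys)
run_a = bagged_slope(xs, ys, n_models=20, seed=1)
run_b = bagged_slope(xs, ys, n_models=20, seed=2)
print(f"classical: {classical:.3f}, bagged (seed 1): {run_a:.3f}, bagged (seed 2): {run_b:.3f}")
```

The classical fit is identical on every run, while the two bagged fits differ slightly because they were built from different random resamples. Fixing the seed makes the bagged result repeatable as well.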

The authors’ results present a remarkable improvement over previous very high-resolution geospatial descriptions of the distribution of the human population. Within the context of human development in LMICs, what is the significance of having a highly accurate description of where each person is located across planet Earth?

In the modern world, the global community has the resources necessary to aid populations in developing countries but lacks the information to target those in need efficiently. Information has therefore become an extremely valuable resource. With an accurate description of where people live, aid can be directed to the places that need it and its delivery can be checked against the population it was meant to reach, making aid both more efficient and more transparent.

Within the context of human development in LMICs, what is the relevance to your area of investigation in having a highly accurate description of where each household and person is located across planet Earth?

In terms of my own study of urbanization within rural China, information such as migration patterns and household locations is incredibly important, since both are indicators of wealth, public infrastructure, and technology. Generally, China is more developed than Nigeria, so studies of China typically use more targeted methods, such as cell phone data, to identify populations.