MIGNEX Insight
Counting households using satellite maps – does it work?
Faced with limited population data in many of our research areas, MIGNEX is using innovative sampling approaches. One component in our sampling strategy is to estimate the number of households by counting dwellings visible on satellite maps. We tested this approach as part of our pilot in São Vicente.
The MIGNEX survey involves many important aspects – a critical one is sampling. In the MIGNEX survey, we have a uniform sample size of 500 respondents in each of our 25 research areas. For the sample to be representative of the target population in each research area, we need to have a sampling frame. In other words, we need to know how many households live in the different clusters of each research area, before dividing up the sample across clusters. However, this kind of data is not available or out of date in many of our research areas. Using outdated or unreliable population lists could give us an inaccurate picture of our target population. This is a particular challenge in areas with informal settlements and high rates of population change, such as high migration rates.
Increasingly, researchers have been making use of advances in GIS technology to draw representative samples for household surveys in the absence of up-to-date and reliable population lists. There are a number of approaches in the emerging literature that use geospatial information to draw representative samples in less developed contexts. For instance, Driscoll and Lidow use grid squares overlaid on satellite maps in Mogadishu, Cajka et al. use population density datasets to create sampling frames in 11 less developed and middle-income countries, and Eckman and Himelein developed an algorithm to generate population estimates from satellite maps in the DRC.
We decided to test such an approach as part of our survey pilot in São Vicente in Cabo Verde. The island has seen lots of urban growth and demographic change since the 2010 census, the last available source on population data. Given that our sampling strategy will be applied across 25 diverse contexts, we focused on developing a simple process that can be easily replicated by all partners involved.
Counting households from 4,000 km away
Our sampling strategy has three stages. In the first stage we sample clusters, in the second we sample households, and in the third we sample respondents within a household. For the second and third stages, we need fairly accurate estimates of the number of households in each cluster.
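To illustrate why the household estimates matter, here is a minimal sketch of one way a fixed sample could be divided across clusters once household counts are known. It assumes simple proportional allocation with largest-remainder rounding, which is our illustration here and not necessarily the MIGNEX procedure. The function name `allocate_sample` and the cluster counts are hypothetical.

```python
def allocate_sample(household_estimates, sample_size=500):
    """Divide sample_size across clusters in proportion to estimated households.

    Assumption: proportional allocation, rounded with the largest-remainder
    rule so that the allocations sum exactly to sample_size.
    """
    total = sum(household_estimates.values())
    exact = {c: sample_size * n / total for c, n in household_estimates.items()}
    allocation = {c: int(share) for c, share in exact.items()}

    # Hand out the interviews lost to rounding down, largest remainder first.
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda c: exact[c] - allocation[c], reverse=True)
    for c in by_remainder[:leftover]:
        allocation[c] += 1
    return allocation


# Hypothetical household estimates for three clusters
estimates = {"A": 120, "B": 340, "C": 95}
allocation = allocate_sample(estimates)
print(allocation)
```

If the household estimate for one cluster is too high, that cluster receives too many interviews and the others too few, which is why the accuracy of the counts matters.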
So how did we go about counting households? We employed a desk-based approach for our initial counts. First, we divided São Vicente into smaller clusters based on recent satellite imagery, following features visible on the satellite map, including roads, rivers and residential blocks, so that the cluster boundaries would make sense to enumerators in the field. To make household estimates for each cluster, we drew building outlines on top of a satellite map, using OpenStreetMap.
Then we marked which buildings appeared not to be residential, cross-referencing with Google Maps, and excluded these from the household count. We also marked buildings that appeared to be multi-household, for instance apartment blocks, using the size of shadows in satellite maps as clues. São Vicente was a good case study for testing our approach, as it includes urban areas in the city of Mindelo, as well as sprawl into the mountains around the city, which includes many informal and recently erected settlements.
Next, cluster by cluster, we counted the residential buildings that we assumed to be single-household dwellings. For buildings that we assumed to be multi-household dwellings, we made assumptions about how many households might live in the dwelling. Combining counts of dwellings (together with adjustments for multi-household dwellings) gave us estimates for the number of households in each cluster.
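The counting rule described above can be sketched in a few lines. This is an illustration only: the `Building` record, the function name and the example cluster are hypothetical, and the default of six households per multi-household dwelling is simply the assumption we report using for the Monte neighbourhood.

```python
from dataclasses import dataclass


@dataclass
class Building:
    cluster: str
    residential: bool = True       # marked non-residential buildings are skipped
    multi_household: bool = False  # e.g. apartment blocks


def estimate_households(buildings, households_per_multi=6):
    """Return {cluster: estimated households} from marked building outlines."""
    totals = {}
    for b in buildings:
        if not b.residential:
            continue
        # Single-household dwellings count once; multi-household dwellings
        # count as the assumed number of households.
        weight = households_per_multi if b.multi_household else 1
        totals[b.cluster] = totals.get(b.cluster, 0) + weight
    return totals


# Hypothetical cluster: three houses, one apartment block, one shop
buildings = [
    Building("A"), Building("A"), Building("A"),
    Building("A", multi_household=True),
    Building("A", residential=False),
]
print(estimate_households(buildings))                          # {'A': 9}
print(estimate_households(buildings, households_per_multi=3))  # {'A': 6}
```

The two calls show how strongly the estimate depends on the assumed number of households per multi-household dwelling.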
The verification exercise
To test how accurate our desk-based household estimates were, we counted the number of households in two small areas on the ground during the survey pilot in São Vicente in February 2020. In essence, this was a simple exercise of walking along the streets of each selected area, looking at each dwelling, counting the number of households that appeared to live in it, and keeping a running tally of households in the segment. Using a highlighter on the printed satellite map, we marked which dwellings we had already counted.
From the ground, we clearly had access to information and clues not visible on satellite maps. For example, smaller businesses, like mechanical repair shops, bakeries or small grocery stores, that were part of, or surrounded by, residential dwellings were not visible on satellite maps but stuck out clearly from the street. It was also possible to spot abandoned dwellings. Using information like the number of balconies, doors or doorbells, we were able to ascertain how many households actually lived in multi-household dwellings.
Jessica conducting the verification exercise. Video: Jørgen Carling for MIGNEX
Does this method work?
The results of the verification exercise were promising. In the Fonte Francês neighbourhood, most dwellings are occupied by a single household. The satellite maps also showed smaller and more irregular dwellings, and we estimated 129 dwellings based on this. We counted 119 dwellings on the ground – hence our desk-based exercise had a margin of error of only 8%.
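The 8% figure can be reproduced with a short calculation. This sketch assumes the error is the absolute difference between the desk estimate and the ground count, expressed as a share of the ground count. The function name is ours, and dividing by the desk estimate instead gives a figure that also rounds to 8%.

```python
def relative_error(desk_estimate, ground_count):
    """Absolute difference as a share of the ground count (our assumption)."""
    return abs(desk_estimate - ground_count) / ground_count


# Fonte Francês: 129 dwellings estimated at the desk, 119 counted on the ground
error = relative_error(129, 119)
print(f"{error:.0%}")  # 8%
```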
For the inner-city Monte neighbourhood, the margin of error was higher at 26%. Some of this was due to businesses not visible on the satellite map. However, the error was mainly the result of inaccurate assumptions about the number of households living in multi-household dwellings. While the correct blocks were identified as multi-household dwellings based on the size of shadows and Google Street View, most of these dwellings were occupied by two or three households, not six as assumed.
This discrepancy also shows how prior knowledge can improve the accuracy of desk-based estimates. Speaking with locals beforehand or doing a quick pre-field recon would have given us information on the average number of households in urban dwellings.
Moving forward, we will be doing a similar verification exercise in all our research areas, to test our assumptions and the accuracy of our desk-based counts. We have also added a critical step to our survey methodology: talking to people familiar with the research area to verify the assumptions made by country coordinators, in this case from 4,000 km away. With these additional checks in place, counting households using satellite maps is a good solution when there are no reliable sampling frames.