Can poor numeracy in Africa affect the integrity of questionnaire data?

After learning more about the phenomenon of “age heaping”, the tendency of reported ages in a population to cluster on round numbers, and its relationship to poor numeracy (Baten & Ferber, 2022; Baten et al., 2022), I started to notice “heaping” effects in other survey data, particularly for African countries characterised by low education levels.


Figure 1. “Heaping” effects for Zimbabwe (ZWE, left) and Ethiopia (ETH, right). A) The number of people who gave each answer (1-10) to each question (each line is one question). B) The actual density and probability of a 1, 5, or 10 compared with the density and probabilities expected if the answers were normally distributed.


Figure 1 shows that in Zimbabwe (ZWE) and Ethiopia (ETH), across 50 different questions from the World Values Survey (WVS) and thousands of respondents, answers are biased towards 1, 5, or 10 on a 10-point Likert scale. The actual probability of a 1, 5, or 10 being selected (pink line) can be compared with the probability that would be expected if the answers followed a normal distribution (blue line) with the same mean and standard deviation as the question responses for each country. The effects are clear, but what is causing this “heaping” distribution?
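The actual-versus-expected comparison described above could be sketched roughly as follows. This is a minimal illustration, not the analysis behind Figure 1: the function names (`expected_probs`, `actual_probs`, `heaping_gap`) are hypothetical, and the way the normal distribution is mapped onto the ten answer options (the normal mass on [k - 0.5, k + 0.5], renormalised to sum to 1) is an assumption, since the exact method is not stated here.

```python
from collections import Counter
from statistics import NormalDist, mean, pstdev

HEAP_POINTS = (1, 5, 10)  # the answers that appear to be over-selected


def expected_probs(responses, scale=range(1, 11)):
    """Probability of each scale point under a normal fit to the responses.

    Assumption: answer k receives the normal mass on [k - 0.5, k + 0.5],
    renormalised so the probabilities over the scale sum to 1.
    Requires at least two distinct answers (standard deviation > 0).
    """
    fit = NormalDist(mean(responses), pstdev(responses))
    mass = {k: fit.cdf(k + 0.5) - fit.cdf(k - 0.5) for k in scale}
    total = sum(mass.values())
    return {k: m / total for k, m in mass.items()}


def actual_probs(responses, scale=range(1, 11)):
    """Observed share of respondents choosing each scale point."""
    counts = Counter(responses)
    n = len(responses)
    return {k: counts[k] / n for k in scale}


def heaping_gap(responses, points=HEAP_POINTS):
    """Actual minus expected probability, summed over the heaping points.

    Larger positive values mean more answers on 1, 5 and 10 than the
    normal fit would predict.
    """
    actual = actual_probs(responses)
    expected = expected_probs(responses)
    return sum(actual[k] - expected[k] for k in points)


# Toy answers for one question (made up for illustration, not WVS data).
toy_answers = [1] * 30 + [5] * 40 + [10] * 30
print(round(heaping_gap(toy_answers), 3))
```

Computing this gap per question and averaging within a country would give one heaping score per country, which is one plausible reading of the x axis in Figure 2.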

Figure 2. The relationship between heaping (x axis: a greater difference between actual and expected probabilities, towards the right, indicates greater heaping) and a country's average highest education level.


A preliminary analysis, displayed in Figure 2, shows that there may be a slight negative relationship between the average reported highest education level in each country and the strength of the “heaping” effects. However, there are clearly countries that do not fit this trend. Further, reported highest education level does not capture practical numeracy skills, which can also be learned “on the street” or “on the job”.
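One simple way to put a number on such a relationship is a Pearson correlation between each country's heaping score and its average education level. The sketch below is an assumption about how this could be checked, not the method used for Figure 2. The helper `pearson_r` is hypothetical, and the input values are placeholders invented for illustration rather than results from the WVS.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Placeholder values only (one entry per country), not real survey results.
heaping_scores = [0.30, 0.22, 0.15, 0.08]
education_levels = [2.1, 3.0, 3.4, 4.2]
print(round(pearson_r(heaping_scores, education_levels), 2))
```

A value near -1 would indicate a strong negative relationship and a value near 0 little or none. With only a handful of countries, a few outliers can move the coefficient a lot, so it should be read alongside the scatter plot rather than instead of it.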


Further investigations are planned to understand whether this effect is due to response bias (e.g. extremity bias (Marshall & Lee, 1998)), cultural differences (e.g. individualistic vs collectivist cultures (Shulruf et al., 2007)), question presentation (e.g. whether a visual aid or descriptive labels remove the effects), how the surveys were conducted (e.g. face-to-face or phone interviews), or how the participants were selected (the sampling method), or whether it may be related to poor numeracy levels, as with age heaping.


I will also investigate how replicable these effects are across different open-source datasets and different questions (in both content and scale type). If the effects prove persistent, they have important implications for how data is collected in developing countries, as well as for the findings and theories that have been built on such data.


References


Baten, J., Benati, G., & Ferber, S. (2022). Rethinking age heaping again for understanding its possibilities and limitations. The Economic History Review.


Baten, J., & Ferber, S. (2022). Numeracy, Nutrition and Schooling Efficiency in Sub-Saharan Africa-1950 to 2000.


Marshall, R., & Lee, C. (1998). A cross-cultural, between-gender study of extreme response style. ACR European Advances.


Shulruf, B., Hattie, J., & Dixon, R. (2007). Development of a new measurement tool for individualism and collectivism. Journal of Psychoeducational Assessment, 25(4), 385-401.