
Evidence, not narratives, should guide discussions about statistics
The Hindu
In a recent opinion article, Economic Advisory Council member Shamika Ravi raised concerns about the quality of data that India’s national surveys – the NSS, the PLFS, and the NFHS – collect. Ravi raises two main issues: overestimation of rural populations and different response rates across wealth groups proxied by income/expenditure, with lower response rates in wealthier groups. The combined inference is that these surveys may be biased towards underestimating urban, wealthier groups. This article explains the problems in Ravi’s article and how they undermine her argument.
In a recent opinion article, Economic Advisory Council member Shamika Ravi raised concerns about the quality of data that India’s national surveys – the National Sample Survey (NSS), the Periodic Labour Force Survey (PLFS), and the National Family Health Survey (NFHS) – collect.
Dr. Ravi raises two main issues: overestimation of rural populations and different response rates across wealth groups proxied by income/expenditure, with lower response rates in wealthier groups. The combined inference is that these surveys may be biased towards underestimating urban, wealthier groups.
We agree that Dr. Ravi has valid concerns about data quality and about representativeness or generalisability. It might seem that such issues should concern only the statisticians who assist with survey design or the researchers and analysts who use these data for insights. But Dr. Ravi suggests that these issues concern us all because they “systematically underestimate India’s progress and development”.
If we agree that data quality issues exist, we must assess their magnitude. On the overestimation of the rural population, there are two points to discuss: how it is depicted and how large it is.
Other responses to Dr. Ravi’s article have noted that an accompanying graph, depicting the rural population percentage, was misleading because the x-axis had been truncated. Dr. Ravi has responded that the “grammar of graphics” supports her visualisation choice. We disagree. Truncating an axis, especially without explicit breaks or an accompanying explanation, is a well-documented problem.
Multiple studies have shown that axis truncation distorts the perception of effect size, i.e. readers perceive differences as larger than they really are. Leading scientific publishers, including Nature and the American Medical Association, advise against truncated axes.
(For all discussion below, we focus on NSS data, but similar results can be demonstrated for PLFS and NFHS data as well.)
