Big Data in Development: Opportunities for Public Policy

Commonly abbreviated as ICT4D, “Information and Communication Technologies for Development” is changing the way governments and public affairs organisations approach socio-economic development. The unprecedented increase in mobile phone usage and the growing adoption of social networks have given us datasets that were non-existent or inconsequential a couple of decades ago. As more and more people willingly share their personal information and details of their lives on the web, they create a wealth of data which, if mined correctly, can power not only businesses but also public policy.

The page you liked on Facebook the other day, for example, speaks of your preferences and interests. The tweet that you re-tweeted a month back gives the world a glimpse of the values and beliefs you stand for. The mobile calls you made over the last week, and the duration of these calls, give a peek into your connections and social network. The GPS on your cell phone helps map the places you visited over the last year. Together, these make up billions of raw, unstructured, seemingly incomprehensible data points. What could possibly be derived from them, especially in the context of the developing world? Let's have a look!

What if a stranger were to come up to you asking for a loan? Would you trust him or her with your money? Would you ask for more information to gauge his or her credibility? Sounds like a lot of work, right? Imagine the plight of formal lending institutions that want to lend to the “unbanked”, who have no credit history, while keeping default rates low. Given the risk involved, formal financial institutions avoid lending to people with no credit history and no collateral to offer, leaving the world’s unbanked poor to resort to informal money lenders who charge extremely high interest rates. However, research using call detail records [1] and transaction data suggests a novel approach: calculating an individual's credit score from his or her mobile transaction data. But what can a person’s mobile usage patterns and creditworthiness have in common? The key is the behaviour patterns that are reflected in a person’s mobile usage and social network, which can be mapped to his or her likelihood of repaying a loan. Positive behavioural indicators may range from timely mobile recharges and payment of postpaid bills before the due date to the fact that a person’s calls are promptly returned by his or her social network. If a person has a continuous record of timely payments and always holds a positive balance in the account, the algorithm would classify him or her as a trustworthy potential borrower and make him or her eligible for the loan applied for.
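To make the idea concrete, here is a minimal sketch in Python of how the behavioural indicators mentioned above could be combined into a score. It is not the model from [1]: the field names, the weights and the eligibility threshold are all hypothetical and chosen only for illustration. A real scoring model would learn its weights from historical repayment data.

```python
from dataclasses import dataclass


@dataclass
class MobileBehaviour:
    """Hypothetical behavioural indicators, each expressed as a share between 0 and 1."""
    on_time_payment_rate: float   # recharges/bills paid on or before the due date
    positive_balance_rate: float  # days on which the account balance was positive
    returned_call_rate: float     # missed calls that contacts promptly returned


# Illustrative weights only (assumption); they sum to 1 so the score stays in [0, 1].
WEIGHTS = {
    "on_time_payment_rate": 0.5,
    "positive_balance_rate": 0.3,
    "returned_call_rate": 0.2,
}
ELIGIBILITY_THRESHOLD = 0.7  # hypothetical cut-off


def credit_score(behaviour: MobileBehaviour) -> float:
    """Weighted average of the behavioural indicators."""
    return sum(weight * getattr(behaviour, name) for name, weight in WEIGHTS.items())


def is_eligible(behaviour: MobileBehaviour, threshold: float = ELIGIBILITY_THRESHOLD) -> bool:
    """Classify a borrower as eligible when the score reaches the threshold."""
    return credit_score(behaviour) >= threshold


reliable = MobileBehaviour(on_time_payment_rate=0.95, positive_balance_rate=0.9, returned_call_rate=0.8)
risky = MobileBehaviour(on_time_payment_rate=0.3, positive_balance_rate=0.4, returned_call_rate=0.5)
print(is_eligible(reliable), is_eligible(risky))
```

With these made-up numbers, the borrower with a steady payment record clears the threshold and the other does not.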

Moving on to a broader context, how does one analyse the wealth of a nation? Is GDP a sufficient measure of a nation’s standing? In a country like India, where the richest 10% earned 55% of the total income in 2016, there is a widening income inequality that needs to be bridged. Welfare services are not meeting demand because of a lack of insight into who needs them. Joshua Blumenstock and his co-authors, in the research paper titled ‘Predicting poverty and wealth from mobile phone metadata’ [2], propose the use of call detail records (CDR) to measure the wealth of a population. Along with CDR, the study uses data collected through phone surveys to predict the financial status of the population through a combination of feature selection techniques, supervised learning algorithms and out-of-sample predictions. The outcome of the study, conducted on data from Rwanda, is illustrated here:

Figure 1. Construction of high-resolution maps of poverty and wealth from call records. Information derived from the call records of 1.5 million subscribers is overlaid on a map of Rwanda. [2]
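The three ingredients named above (feature selection, supervised learning and out-of-sample prediction) can be sketched in a few lines of Python. This is a deliberately tiny illustration and not the pipeline used in [2]: the feature names and all of the numbers are invented, feature selection is reduced to picking the single feature most correlated with the surveyed wealth index, and the supervised model is a one-variable least-squares line.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def select_feature(features, target):
    """Feature selection: keep the feature most strongly correlated with the target."""
    return max(features, key=lambda name: abs(pearson(features[name], target)))


def fit_line(xs, ys):
    """Supervised learning: ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


# Invented CDR-derived features for four surveyed subscribers (assumption).
surveyed_features = {
    "calls_per_day": [2, 5, 8, 11],
    "night_call_share": [0.3, 0.1, 0.4, 0.2],
}
surveyed_wealth_index = [1.0, 2.0, 3.0, 4.0]  # from the phone survey (invented values)

best = select_feature(surveyed_features, surveyed_wealth_index)
slope, intercept = fit_line(surveyed_features[best], surveyed_wealth_index)

# Out-of-sample prediction: a subscriber who was never surveyed.
unsurveyed_calls_per_day = 14
predicted_wealth = slope * unsurveyed_calls_per_day + intercept
print(best, predicted_wealth)
```

The point of the design is that the survey is only needed for a small sample: once the model is fitted on surveyed subscribers, it can be applied to everyone else in the call records.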

Prof. Blumenstock further extends this approach to measuring migration and mobility. With Rwanda as the geography under study, his research paper titled ‘Inferring patterns of internal migration from mobile phone call records’ [3] combines call detail records and phone surveys to match demographic factors with people's migration trends. The study uses three primary factors to infer the migration of an individual: the number of cell phone towers used by the individual during the time period, the maximum distance between those towers, and the radius of gyration, which is the root mean square distance of all the locations the individual visits from his or her centre of gravity. A person’s centre of gravity is the weighted average of all the points from which the individual makes or receives a call, where the weight is the number of times the individual calls from each location. On the basis of these factors, the algorithm then infers whether the individual migrated in a given month by checking whether he or she moved to a new location for a number of months after having stayed in a particular location for a considerable duration.
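The factors described above are easy to compute once calls are grouped by tower. The Python sketch below is an illustration under stated assumptions, not the code from [3]: tower positions are treated as flat (x, y) coordinates in kilometres rather than latitude and longitude, the radius of gyration is weighted by call count in the same way as the centre of gravity, and the migration rule (the same location for `stay` months, followed by a different location for `stay` months) is a simplified, hypothetical reading of the description.

```python
from itertools import combinations
from math import hypot, sqrt


def mobility_features(calls_by_tower):
    """calls_by_tower maps a tower position (x_km, y_km) to the number of calls made there."""
    total = sum(calls_by_tower.values())
    # Centre of gravity: average tower position, weighted by call count.
    cx = sum(x * n for (x, _), n in calls_by_tower.items()) / total
    cy = sum(y * n for (_, y), n in calls_by_tower.items()) / total
    # Radius of gyration: root mean square distance from the centre of gravity.
    rog = sqrt(sum(n * ((x - cx) ** 2 + (y - cy) ** 2)
                   for (x, y), n in calls_by_tower.items()) / total)
    # Largest distance between any two towers used (0.0 if only one tower).
    max_dist = max((hypot(ax - bx, ay - by)
                    for (ax, ay), (bx, by) in combinations(calls_by_tower, 2)),
                   default=0.0)
    return {
        "towers_used": len(calls_by_tower),
        "max_distance_km": max_dist,
        "centre_of_gravity": (cx, cy),
        "radius_of_gyration_km": rog,
    }


def migrated_in_month(monthly_location, month, stay=3):
    """True if the `stay` months before `month` share one location and the
    `stay` months starting at `month` share a different one (hypothetical rule)."""
    before = monthly_location[max(0, month - stay):month]
    after = monthly_location[month:month + stay]
    if len(before) < stay or len(after) < stay:
        return False  # not enough history on one side to decide
    return len(set(before)) == 1 and len(set(after)) == 1 and before[0] != after[0]


features = mobility_features({(0.0, 0.0): 3, (4.0, 0.0): 1})
print(features)
print(migrated_in_month(["A", "A", "A", "B", "B", "B"], month=3))
```

In the toy example, three calls at one tower and one call at a tower 4 km away place the centre of gravity 1 km from the first tower, which shows how heavily used locations pull the centre towards them.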

Although ICT-generated data addresses the data-quality problems and costs associated with traditional collection methods such as government censuses and household surveys, it has its own challenges and limitations which need to be taken into account. Firstly, privacy is a major concern: such data contains personally identifiable information about each individual along with a history of their past travels [3]. It holds a depth of personal information which can be misused if not protected with the right security measures. Apart from the ethical concerns, insights derived from ICT-generated data may suffer from sampling bias when the sample is not representative of the population. When inferring migration patterns from the mobile usage of Rwandans, for instance, there is a significant difference in the behaviour and lifestyle of those who use mobile phones versus those who don’t. Hence, inferences drawn about mobile phone users cannot be extrapolated to non-users. Lastly, it is extremely difficult to obtain such sensitive data from mobile phone operators, who are skeptical about sharing customer data for research purposes.

Despite the challenges involved in dealing with ICT-generated datasets, the potential of the information that can be amassed from them is significant. From promoting financial inclusion to gauging the wealth of nations, the applications of such datasets are wide and varied in the context of socio-economic development. With 4.77 billion mobile users in the world and the number constantly growing, there is a wealth of data ready to be tapped to study human development and inform effective public policies.


[1] Bjorkegren, Daniel, and Darrell Grissen. “Behavior Revealed in Mobile Phone Usage Predicts Loan Repayment”. SSRN, 13 July 2015. Available at:

[2] J. Blumenstock, G. Cadamuro, R. On. “Predicting Poverty and Wealth from Mobile Phone Metadata”. Science, 350 (6264): 1073 (2015). DOI: 10.1126/science.aac4420

[3] J. E. Blumenstock. “Inferring Patterns of Internal Migration from Mobile Phone Call Records: Evidence from Rwanda”. Information Technology for Development 18, 107–125 (2012). DOI: 10.1080/02681102.2011.643209



An Indian by nationality, an engineer by profession and a tinkerer by habit, Asra aspires to explore the world beyond its confines and come up with solutions that can drive change, thereby promoting a better quality of life for all. With a Bachelor's in Technology specializing in Computer Science & Engineering, Asra worked for a leading Bay Area company for two years before taking up the AIF Clinton Fellowship. With the goal of applying technology and analytics to the field of developmental policy, Asra is seeking to answer some of the pertinent challenges faced by India's development sector. During her tenure as a Fellow, Asra will be working with IFMR LEAD in the domain of data science for public policy. When she is not mulling over life and its intricacies, you might find her engrossed in a book or enjoying an engaging conversation.
