Data Science for Fighting Human Trafficking


Recently, one of my dear friends made me aware of the human trafficking problem in India. Human trafficking is a global scourge with its roots in poverty, lack of opportunity, and related difficult socioeconomic conditions.

I am a data scientist, and this makes me wonder whether Big Data can be used to understand and help solve such social problems. How can we employ the tools of modern data science for social good?

At New College of Florida, the honors liberal arts college of Florida, my colleagues, students, and I were recently discussing the role of data science in socially relevant projects. This post is inspired by that discussion and by my friend’s work in India.

Such an effort will have to be interdisciplinary: people from the social sciences, computer science, economics, policy making, law enforcement, and non-governmental organizations will have to come together and pool their efforts.

The first step will be data collection. Because there are many stakeholders, data can be collected at multiple points and in different forms: from hotlines, from real-time monitoring of microeconomic factors, and from censuses and surveys that give a detailed picture of local society, education, economy, and employment.

By a common rule of thumb, more than 80% of data science is the hard work of collecting and cleaning data. The data will arrive in different formats, both structured and unstructured, so getting it into usable shape is a big task in itself. Having plenty of clean, good-quality, real-time data can go a long way toward making the exercise successful.
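
As a small illustration of what this cleaning work can look like, here is a minimal sketch that brings two differently formatted sources into one table. Everything in it is an assumption made for illustration: the field names, the district labels, and all values are made-up placeholders, not real data, and the helper functions are hypothetical.

```python
import csv
import io
import json

# Hypothetical hotline export (structured CSV); all values are placeholders.
HOTLINE_CSV = """district,date,reports
 North ,2015-01-03,2
south,2015-01-03,
North,2015-01-04,1
"""

# Hypothetical survey export (JSON) that names its fields differently.
SURVEY_JSON = (
    '[{"District": "NORTH", "literacy_rate": "0.71"},'
    ' {"District": "South", "literacy_rate": "0.64"}]'
)


def clean_district(name):
    """Normalize a district label so the two sources can be joined on it."""
    return name.strip().lower()


def load_hotline(text):
    """Sum report counts per district, skipping rows with a missing count."""
    totals = {}
    for row in csv.DictReader(io.StringIO(text)):
        if not row["reports"].strip():
            continue  # missing value: skip it rather than guess
        key = clean_district(row["district"])
        totals[key] = totals.get(key, 0) + int(row["reports"])
    return totals


def load_survey(text):
    """Map each normalized district label to its literacy rate as a float."""
    return {
        clean_district(rec["District"]): float(rec["literacy_rate"])
        for rec in json.loads(text)
    }


def merge(hotline, survey):
    """Build one record per district; None marks a value no source provided."""
    districts = sorted(set(hotline) | set(survey))
    return [
        {
            "district": d,
            "reports": hotline.get(d),
            "literacy_rate": survey.get(d),
        }
        for d in districts
    ]


merged = merge(load_hotline(HOTLINE_CSV), load_survey(SURVEY_JSON))
for record in merged:
    print(record)
```

Missing values are kept as None rather than filled in, so that later analysis can decide explicitly how to treat them.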

The data can then be summarized and visualized. This will portray a complex picture, not only of actual incidents and trafficking patterns as recorded by law enforcement, but also of the underlying economic, demographic, geographical, and social factors.

One can then proceed to statistical analysis of the data to answer many questions:

  • What are the patterns in human trafficking?
  • How can human trafficking be detected?
  • What factors correlate with high incidence of human trafficking?
  • What are the potential causes?
  • Which law enforcement techniques are most effective?
  • What is the profile of perpetrators? How do they operate?
  • What is the demographic profile of affected areas?
  • What is the profile of victims?
  • How can people in affected areas be educated to combat the problem?
  • What are the sources and destinations of human trafficking at regional, national and international levels?
  • Which programs for victim rehabilitation are most effective?
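
To make the question about correlated factors concrete, here is a minimal sketch of a Pearson correlation coefficient between one factor and incidence across districts. The variable names and all numbers are made-up placeholders, not real measurements, and a high correlation alone would not establish a cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Placeholder values, one entry per hypothetical district.
unemployment_rate = [0.05, 0.10, 0.15, 0.20]
reported_cases_per_100k = [1.0, 2.0, 3.0, 4.0]

r = pearson(unemployment_rate, reported_cases_per_100k)
print(f"Pearson r = {r:.3f}")
```

A coefficient near +1 or -1 suggests a strong linear association worth investigating further; establishing actual causes would require domain knowledge and more careful study design.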

Here, the role of statistics and data science is to provide technological tools. These tools come in the form of distributed data collection, database software, visualization software, geographical mapping software, and statistical techniques.

Backed by data science and a deep understanding of ground realities, one will be able to conquer this social evil that plagues humanity. My hope is that such an effort will lead to the application of data science for social good in many other projects.