## Amazing Train: A Poem for Children

Children, let us go on a wonderful train.

We will board it at Big Bang.
In a dazzling explosion of light
Our train will begin its journey!

Galaxies will spin
Like fireworks around us. Through the window
We will witness
Greatest wonder
As life in its humblest origin
Of floating specks full of potentialities,
Transforms into men and women and their cities
And their short-lived dreams
And their final anguish,
When we will see the Sun finally die
And we will cry for our dearest beautiful Earth.
Train will keep on speeding
Through the strangest vistas
Inside Black Holes
And even beyond.

Trillions of trillions of years will pass,
We will be just as amazed
By what we will see and hear and know.
What is all this? Who are we?
In a dream-like landscape,
Our train will just keep on going
Towards Eternity

You and I listening to the most melodious
Vibrations of Space-Time
As finally we become one with the Music.
(May 1, 2012; Written for all science loving children and my kids.)

## Concept of Web Search Engines explained using Snakes and Ladders game

Recently we were playing a homemade Snakes and Ladders game. While I was being ruthlessly swallowed by giant snakes, it struck me that I could use the same game to answer my children’s usual questions about how web search engines work.

I have created an educational video for children to inspire them to study mathematics, because it is everywhere: in Snakes and Ladders as well as in the cutting-edge technologies inside web search engines.

## Machine Learning, Deep Learning and Scientific Understanding

Machine learning is super hot in Silicon Valley these days! It has emerged as a very useful discipline in computer science and statistics. If you have skills in machine learning, you are likely to get a nice job. But what exactly is machine learning? With the growing popularity of the field, engineers and scientists know the technical answer, but can it be explained to everyone in simple language? What are the current trends in machine learning and what could come next?

In this article, we will focus on statistical learning and discuss the state of the art, trends and future directions.

## Decisions, Decisions, Decisions!

Why is machine learning hot? Well, there are so many decisions to be made everywhere. Suppose you want to predict, recommend, classify or rank something in an automated, data-driven manner. Then your best bet is to use statistical machine learning.

Netflix wants to recommend movies which you may like. Google wants to rank web pages by their relevance to your search query. Facebook and LinkedIn want to display those advertisements which are likely to be clicked. A biotechnology company wants to offer a diagnostic platform which predicts a medical condition from gene and protein expression. When you wave your hand in front of Kinect, it tries to classify its 3-D depth sensor data into different body parts and understand your gesture. A retailer wants to predict demand for inventory items. Self-driving cars want to understand the traffic immediately around them. And the list goes on.

In the future, a household robot will recognize faces, understand gestures, facial expressions and speech, and it will be able to move around your house helping out with chores. It will have to make a lot of decisions almost non-stop! Machine learning will be the basis for such Artificial Intelligence.

It is not easy to come up with rules of thumb to make decisions for all these problems. There could be a very large number of situations to handle, and you may not be able to write down a simple recipe which allows you to make decisions in all possible cases with the desired accuracy. There could be a complex underlying process going on which may be quite difficult to capture in a simple handcrafted model.

Machine learning is about building software which makes decisions in a statistical sense. You train this software on training data, which is data for which humans have provided their judgements or labels, and these labels are presumed to be the correct decisions. Therefore, it captures human intelligence and decision-making experience. At a very concrete level, training data can be viewed as an Excel table with rows and columns. Rows are the different instances or samples of your problem. Columns represent features or signals which you think could be useful. There is an extra column which holds the correct decision, filled out by human experts.
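To make the table picture concrete, here is a minimal Python sketch. The feature names and all the values are made up purely for illustration; the only point is the shape of the data: one row per sample, one column per feature, and a final column holding the human-provided label.

```python
# Hypothetical toy training table: every name and value here is invented
# for illustration and does not come from any real dataset.
feature_names = ["age", "income_thousands", "num_friends"]

# Each row is one sample: the feature values followed by the label
# that a human expert filled in (the "extra column").
training_rows = [
    (25, 40, 10, 7),
    (40, 85, 3, 5),
    (63, 30, 8, 8),
]


def split_features_and_label(row):
    """Separate the feature columns from the final label column."""
    *features, label = row
    return features, label


features, labels = [], []
for row in training_rows:
    x, y = split_features_and_label(row)
    features.append(x)
    labels.append(y)

print(len(training_rows), "rows,", len(feature_names) + 1, "columns")
```

A training algorithm would receive `features` and `labels` as its input.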

## Example: Predicting Happiness

To make the discussion livelier, let us apply machine learning to a problem which is the subject of so many self-improvement books. We will try to achieve through machine learning, by just using statistics, something which has preoccupied humanity for ages: we will predict happiness!

Happiness is our target or response variable. Using our intuition and in consultation with happiness experts, we make a list of all those factors which we presume to be important in predicting happiness. They could be age, gender, income, relationship status, number of children, political beliefs, religious beliefs, job satisfaction, number of friends, personality type, and so on; these will constitute our features, signals or predictor variables. Let us say we make a list of 20 such features. We then go around the world surveying, say, 5000 people. For each person, we get the values of the 20 features. The training data as an Excel table will have 5000 rows and 21 columns. Why 21 columns? The first 20 columns are the features. The 21st and last column is the target variable, which indicates how happy the person is, for example on a scale from 0 to 10. How do we fill out this 21st column in our happiness project? Well, here human judgement, experience and intelligence come into play. It could be a self-reported happiness level, or it could be computed by experts whose task is to assign each person a happiness value. An important point is that this value will be assumed to be correct. That is why training data is sometimes called a golden dataset or ground truth, and this process of using labeled data is called supervised learning.

Once we have our 5000-row, 21-column Excel table filled out, we can use it as input to a machine learning training algorithm, which will try to build a mathematical function or model that maps the predictor variables to the output variable. The output of this supervised learning is the trained model, which we will then use in practice to predict the happiness of any person.
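As a deliberately simplified illustration of "training", the sketch below fits a straight line to a single feature using ordinary least squares. This is my own stand-in for a real training algorithm: it assumes just one predictor (a hypothetical job satisfaction score) instead of 20, and the four data points are invented.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical survey answers: job satisfaction and happiness, both on 0-10.
job_satisfaction = [2, 4, 6, 8]
happiness = [3, 5, 6, 8]

# "Training": estimate the two parameters of the model from the labeled data.
slope, intercept = fit_line(job_satisfaction, happiness)


def predict_happiness(satisfaction):
    """The trained model: maps the predictor variable to the output variable."""
    return slope * satisfaction + intercept


print(predict_happiness(5))
```

The trained model here is nothing more than the two numbers `slope` and `intercept`; richer models have many more parameters, but the idea of estimating them from labeled rows is the same.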

What form does the trained model take? Machine learning literature gives you many choices, from very simple ones to quite complex ones. Let us say we use them as black boxes and simply try all of them out using a brute force approach and see which one works best.

The training algorithm will also tell us how well we succeeded in training. Suppose it tells us that we achieved high accuracy. Can we then open Champagne bottles and start celebrating having solved this age-old problem? No. What we need to do next is apply our model to data which it has not seen. This is called test data, and we are shown this data only once, after training has been completed. This is our real examination. If we do really well on this test set, and assuming the test set is fairly big and representative, then yes, we can definitely celebrate! 🙂 And it will be another jewel in the crown of machine learning.
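Why is high training accuracy not enough? The sketch below uses a nearest-neighbor predictor, chosen by me because it simply memorizes its training data, so its training error is zero by construction. All numbers are invented, and mean absolute error is just one possible way to score the predictions.

```python
def nearest_neighbor_predict(train, x):
    """Return the label of the training sample whose feature is closest to x."""
    closest = min(train, key=lambda pair: abs(pair[0] - x))
    return closest[1]


def mean_absolute_error(train, samples):
    """Average absolute gap between predicted and true labels on `samples`."""
    errors = [abs(nearest_neighbor_predict(train, x) - y) for x, y in samples]
    return sum(errors) / len(errors)


# Hypothetical (job_satisfaction, happiness) pairs.
train_set = [(2, 3), (4, 5), (6, 6), (8, 8)]
# Held-out people the model never saw during training.
test_set = [(1, 4), (4.5, 4), (6.5, 7), (9, 8)]

train_error = mean_absolute_error(train_set, train_set)
test_error = mean_absolute_error(train_set, test_set)

print("training error:", train_error)
print("test error:", test_error)
```

The model looks perfect on the data it memorized and noticeably worse on the held-out people, which is why the test set is the real examination.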

## Machine Learning Concepts

While we are trying to predict happiness we may read a few books on machine learning. In machine learning literature, we would encounter concepts such as training error, generalization error, bias-variance tradeoff, VC dimension, PAC learning, overfitting, underfitting, ROC curves; we will become familiar with models such as linear models, logistic regression, neural networks, support vector machines, Bayes classifier, decision trees, nearest neighbors, probabilistic graphical models, generative and discriminative models; we will learn about gradient descent, convex optimization, clustering, expectation maximization, boosting, bagging, bootstrapping, Monte Carlo techniques, cross validation, dimensionality reduction, regularization; and many other things. Knowledge of linear algebra and statistics will be quite handy in mastering these concepts.

All these will give us a theoretical understanding of what is going on in machine learning as well as a set of practical tools. We will discover that machine learning is an empirical science built on a rich theoretical foundation, and that it requires a lot of experimentation and continuing, iterative improvement cycles. We will realize that we need rich visualization tools which help us discover patterns in our data and in our results so that we can make these continuing improvements.

## Deep Learning

In our example of happiness prediction, we used our intuition and understanding of happiness to list 20 features which we thought were important in predicting happiness. Depending on the application, we will come up with an appropriate list of features which we think are important. For example, for Google, PageRank is an important and well-known feature to rank web documents. If we are trying to classify a digital image, we will use certain computer vision features, for example, those based on image intensity gradients. This is called Feature Engineering. A lot of innovation is required to design such useful features, and therefore publications which propose them get well cited.

Let us now stop briefly and make two observations. One crucial observation is that we live in a world which is best modeled in a hierarchical manner, from elementary particles to giant galaxies, from raw pixels in your digital camera image to a familiar face, from vibrations of air molecules to a Beethoven symphony, from simple realities within us and around us to the elusive concept of happiness.

Another crucial observation is that the human mind, which is our best model for intelligence, does not exactly go through the process of supervised learning. A human baby looks around the world and figures out a lot of patterns in an unsupervised manner. Parents and teachers only provide a gentle supervised touch to this innate learning process. We follow a process of building models of the world based on evidence and gradual refinement.

How exactly unsupervised and supervised learning will work together to give the best solution is still being researched. It is worth noting that supervised approaches continue to perform well: when we do have a large amount of labeled data and we use layers of hierarchical features which are learned automatically, supervised learning offers competitive solutions and outperforms unsupervised learning. At the time of writing of this article, convolutional neural networks trained using a supervised approach seem to be performing the best for computer vision.

Can we somehow capture these ideas and improve classical supervised machine learning? It could also be very useful, as this is the era of Big Data. An enormous amount of web data, mobile data, image data, video data, social networking data, customer data, and biological and medical data is being collected in giant data server farms, and it is practically impossible to label this humongous data as classical supervised machine learning requires.

An unsupervised or semi-supervised approach which works with unlabeled data seems to be our better bet from a practical point of view. We can try to build hierarchical representations of features automatically using unlabeled data, just like a human baby does, through an iterative process of model building, model matching and model refinement, some components of which could be based on supervised learning. The human brain is thought to be a hierarchical deep network of neurons with many layers. Starting with raw data, from our eyes through the optic nerve to the visual cortex or from our ears through the auditory nerve to the auditory cortex, it builds a hierarchical representation of features, which ultimately leads to recognition of a familiar face or to appreciation of a beautiful song.

Deep learning derives its motivation from this biology. Deep learning is a technique currently being researched by the machine learning community in which we train hierarchical features in an unsupervised manner using huge amounts of unlabeled data; these features are then fine-tuned in the classical supervised manner using a much smaller amount of labeled data. We are therefore automating the process of feature engineering, which earlier used to require human ingenuity.

Consider the example of classifying an image as that of a cat. One can feed lots of images, both of cats and non-cats, perhaps using millions of YouTube videos (as was recently demonstrated by a machine learning team led by Prof. Andrew Ng), to a deep learning algorithm and let it combine raw pixels into features such as edges, then a next level of features such as composite edges, corners and basic textures, and then a next level of features such as eyes, ears, fur, etc., till we get a high-level feature which puts them all together into a cat’s face.

Coming back to the human baby example, it seems 3 billion years of evolution have trained us with the first layers of these robust vision features, and then, aided by the amazing flexibility of the human brain, a baby has no trouble in training higher layers to recognize cars, table lamps, people, butterflies, flowers, etc. So evolution of mind and flexibility of mind go hand-in-hand in creating human intelligence! Parents, teachers and other people are still important as they train the highest layer, which gives us social and emotional intelligence.

## Artificial Intelligence and Natural Intelligence

Since deep learning has strong biological motivations, are we then moving towards a future in which the line between natural intelligence and artificial intelligence is blurring? It is exciting to see that computer scientists and neuroscientists can now work together to unravel the mysteries of the human mind!

At the same time, we should realize that machine learning, which includes deep learning, is best explained in terms of mathematics. We are really training a mathematical model which may or may not correspond to how the human mind works, but it will still have all the appearance of intelligence in functional form. Machine learning tries to replicate the mapping of features to correct predictions using whatever works best in practice. Therefore, such software appears intelligent when viewed as a black box, but it could be employing a totally different mechanism to perform this mapping than the one we employ in our brain.

Which one is higher intelligence, artificial or human? Which one has better long-term potential?

Since the human mind is only one of the ways to do this mapping, despite our anthropocentric mindset, there exist pure mathematical models out there which significantly outperform the human mind and which hopefully will be discovered by us at some point. When coupled with the fact that cloud computing of the future will be able to train and execute these models at a mind-boggling scale and speed, it is reasonable to predict that artificial intelligence will eventually surpass human intelligence! It should make us humble as well as proud.

## Scientific Understanding

That all sounds very exciting, useful and practical. In the era of Big Data computing, machine learning, which includes deep learning, is a great tool. But is it just useful for businesses? Aren’t we truth seekers and not just utilitarians? Does it lead to a better understanding of life and the universe?

Let us say we did a great job in predicting happiness using statistical machine learning. Does this model tell us some new truths about happiness? How do these features really affect happiness? What are the underlying processes and cause-effect relationships? Did we merely capture statistical correlations and nothing more? Where are the psychological truths? Happiness is a difficult topic and it involves social and political realities. Does our model teach us how we can create better societies which enhance happiness?

Taking a more concrete and down-to-earth example, suppose we used genomic and proteomic data to predict a disease using a machine learning model such as a neural network. Even if it achieves high accuracy, does it tell us anything useful about the underlying biological pathways? Understanding how genes and proteins interact with each other and under the influence of epigenetic and environmental factors is as difficult as unravelling the paradoxes of happiness.

Machine learning used as a black box therefore seems to be just a statistical tool devoid of truths, at least at first glance.

The good news is that we can make it a tool for scientific understanding. This is one exciting area where this field can grow and become mature in both the applied and the theoretical sense! How can we interpret machine learning models? What insights can they give us about the problem at hand? How can they assist truth seekers and at the same time give something useful to utilitarians?

We want to better interpret the machine learning models we build. This desire need not be just a goal to help science; it can also be rooted in the pragmatic goals of business. A business may not want to see unexpected blunders and errors by its trained machine learning model. It may opt for a machine learning model which is simpler and therefore amenable to human interpretation in order to avoid such errors. Once the number of features becomes large and we start employing very complex machine learning models, we lose understanding and therefore control, which can cause uneasiness among business leaders.

Interesting future work can be done in exploring such high-dimensional feature spaces, the complexity of models and their simplification, feature interactions and the underlying dynamics of processes, and unexpected errors.

Therefore, we should resist our immediate temptation to call machine learning a statistical tool for practical business goals in contrast with science which tries to understand reality.

We should also remember that our scientific laws are also mathematical models. Newton’s law of gravitation was a mathematical equation till it got superseded by Einstein’s space-time curvature. Though it is only an amusing story, for Newton the training data consisted of an apple which fell on his head, but it was enough for building his great model that worked well for both the apple and the moon. 🙂 Quantum Mechanics is a mathematical model. We are struggling to interpret Quantum Mechanics. Is it just a useful tool or is it a truthful depiction of reality? We believe in these mathematical theories in a statistical sense, and that may not be too far from machine learning. We do experiments, which are like our test set, and we confirm the predictions of these models. This is the scientific method. But for Truth, our bar is the highest possible and we firmly demand 100% accuracy. Even a single violation sends us back to search for a better theory. Newtonian mechanics is approximate and quite useful in practice but not exact. Einstein has provided us with a better theory. And the search goes on.

## Collaboration in Machine Learning

One effective way to make progress in machine learning will be through collaboration. One of the bottlenecks in machine learning is training time. As we bridge the gap between artificial intelligence and human intelligence, we will have to replicate what three billion years of evolution did for natural intelligence. More complex models are built on top of simple models.

To aid in this incremental and hierarchical improvement, successful models which we build in academia and open research labs can be released into the public domain. There could be an open-source initiative in which we keep a repository of machine learning models. Like the evolution of natural intelligence in nature, artificial intelligence will evolve with time and through social community effort. We should be able to reuse, modify and enhance artificial intelligence models built by others. Of course, it will need some effort on the side of data standardization, pre-processing and post-processing, but this should be a solvable technical problem.

That concludes this article. It is hoped that scientific machine learning, which aims to understand, and social machine learning, which aims to replicate evolution, will lead us to exciting new frontiers in the coming decades.

## Conscious Capitalism: An Effective Model to Motivate and Organize Human Effort

Like many people, I have had mixed experiences with corporations and capitalism. I have worked in places where employees were happy and creative and in those where things could have been better.

However, seeing things from an employee’s perspective does not give the complete picture. One should try to understand a particular business from an evolutionary point of view. How was the business started? What were the entrepreneur’s original motivations? What stage is the organization currently in?

A straightforward stereotypical answer is that the business was started in order to make money. The goal of the company continues to be to maximize profits, it exists only for the sake of its shareholders, and the sole purpose of the management is to organize the company in order to attain this profit-making goal.

This simplistic answer is being challenged these days. A book titled Conscious Capitalism: Liberating the Heroic Spirit of Business, by John Mackey and Rajendra Sisodia, published in 2013, offers a new answer which is the topic of this post.

Conscious capitalism is holistic, humanistic and value driven. An entrepreneur following conscious capitalism is motivated by several things. One of them may be making money and maximizing profits, but there could be several other more powerful motivations such as doing service to others, creating something new, or changing the world. Furthermore, in conscious capitalism a corporation cares about all of its stakeholders and not just shareholders. Stakeholders include investors, customers, employees, suppliers, partners and society.

These ideas are not completely new. In fact, I was reading Henry Ford’s quotes and some of them resonated with the tenets of conscious capitalism. Henry Ford not only changed the way people travelled but also took one amazing step in 1914 when he doubled the wages of his workers. In Strategic Management: A Stakeholder Approach, published in 1984, R. Edward Freeman advocated stakeholder theory, which provides the conceptual foundation for conscious capitalism.

In the book Firms of Endearment, published in 2007, Rajendra Sisodia, David Wolfe and Jagdish Sheth analyzed the performance of companies which follow conscious capitalism. These widely loved companies outperformed the S&P 500 by a factor of almost 10 over a 10-year period! The list of Firms of Endearment has 30 companies which have earned a reputation for being great companies.

By the way, John Mackey is co-CEO and co-founder of Whole Foods, and he teamed up with Rajendra Sisodia to write the 2013 book Conscious Capitalism. This has given the book a credible first-hand account of building a great business which is driven by strong values and by a holistic vision.

One thing which comes out of the conscious capitalism movement is to look at the constituents of a business as human beings. Humans have deep psychological drives of love, caring, cooperation, exploration and self-actualization. We want to live meaningful lives. And, some of us want to go even beyond that! As in Star Trek we are explorers who want to go where no one has gone before.

This is what I call human effort: a desire to explore the world, to make it a better place and to solve the grand challenges which surround us!

Conscious capitalism in democratic societies is an effective model for this human effort. It looks at capitalism as this effort embedded in a larger human setting and aware of its higher calling. This model motivates us further towards exciting new territories, organizes us in effective teams, and at the same time makes our humanity even stronger than before. I hope that this movement will succeed and we will see the list of widely loved and successful companies grow in the coming years.

## On Bringing Excitement and Fun to Computer Science Education

In engineering departments around the world, the emphasis at present is on imparting technical and analytical skills to students so that they become good engineers and scientists. Furthermore, if professors want to gain recognition and earn tenure, there is no choice for them but to pursue a highly specialized research area.

And in this rush to create engineers and to do highly specialized research, one loses the big picture!

What is the big picture? The big picture is that of intellectual pursuit for the sake of joy, and of an education system which should create well-rounded, complete human beings rather than just engineers. Students graduating from technical institutes should have the option of becoming entrepreneurs, social leaders, teachers, artists, writers, and not just engineers. Research should demolish interdisciplinary boundaries and should offer new alternatives to the super-specialization career path.

In Science, Order and Creativity, David Bohm and F. David Peat argue that science has lost its wholeness by getting fragmented into narrow disciplines. They call for renewed emphasis on ideas rather than formulae and on understanding of the whole rather than mechanics of parts, and for playfulness in science and in life. According to David Bohm, the division of science and art is temporary. Just as art consists not simply of works of art but of an attitude, the artistic spirit, so does science consist not in the accumulation of knowledge but in the creation of fresh modes of perception.

In an address to the faculty and students of IIT Kanpur, India, in 2009, Professor Yash Pal emphasised the importance of joy in academic work, of abolishing boundaries between disciplines and of unconventional ways of thinking and working.

Now let us focus on Computer Science, which is the subject of this article. There are some challenges specific to Computer Science.

First, Computer Science as taught in universities seems to have lost some of its identity. Other disciplines such as Bioinformatics, Biological Sciences, Mathematics, Electrical Engineering, Mechanical Engineering and Physics have their own computer-related courses in which they teach applications of computer technology to their respective fields. Some Computer Science departments have gradually become service departments teaching programming to students majoring in other fields.

Second, interest in Computer Science is not great among high school students. And among girls, Computer Science is definitely not a popular field of study. It is considered a boring and unattractive discipline, reducing one to a programmer working in an impersonal cubicle with little human interaction. This does however contrast with the situation in developing countries such as India, where students are more worried about employment and therefore Computer Science is considered appealing from the point of view of career prospects.

At the same time, it is widely accepted that there will be a shortage of engineers in the USA in the coming years. Therefore, it is imperative to improve the interest of high school students in STEM (Science, Technology, Engineering, and Mathematics) fields.

Some departments have taken notice of the problem and have started offering more innovative curricula. An example is the College of Computing at Georgia Tech. A Computer Science undergraduate there has the option of selecting one of the following threads: modelling and simulation, devices, people, systems and architecture, theory, information internetworks, intelligence and media. Stanford University offers a degree in Symbolic Systems, in which students probe the meaning of intelligence in natural and artificial symbolic systems.

To revitalize Computer Science and to make it attractive to kids in school, an effort should be made to communicate to them the importance of Computer Science, and at the same time universities should take another look at their curricula.

Computer Science holds a very special role in the world of intellect and of sciences, its technology having tremendous impact on society.

Why is Computer Science so special? Because it intertwines with mathematics, the notions of computational universality, undecidability and intractability being great intellectual achievements of the 20th century; it makes trains and cars run and planes fly; it makes robotic rovers roam the surface of planets; it makes the internet possible, where Google, Facebook and Wikipedia are all household names; it helps Hollywood produce some of its finest movies with amazing special effects; it brings medical sciences closer to victory over diseases; it organizes huge amounts of data so that people can make decisions about the most pressing problems such as environment, energy, water, nutrition and shelter; it lets people talk to each other on their mobile phones; it provides tools to social and political scientists, economists and psychologists so that they can build models and simulate them for a better understanding of their theories; and it gives a new medium to artists, writers and poets for self-expression.

Computer Science pervades the fabric of modern life everywhere and is cool! Can this message be communicated to high school students so that they feel excited about pursuing it as their major?

Perhaps we need courses which provide a thought-provoking, stimulating look at Computer Science and how it crosses boundaries with Mathematics, Sciences, Arts and Humanities. Students should study how Computer Science and related technologies have had a broad and deep impact, allowing them to see the discipline as an integral component of the human endeavour to understand the world and to make it a better place. This will bring excitement and fun to the field.

In such an interdisciplinary approach to curriculum, experts from other fields can contribute to building a holistic vision of the field by highlighting how Computer Science helps them in their respective fields and by proposing new cross-disciplinary ideas for productive and socially relevant research.

One recent example of the multidisciplinary nature of Computer Science is Deep Learning, which is an active research area in Machine Learning and which has biological motivations. Collaboration between biologists and computer scientists may lead to a better understanding of how the human brain works.

Can Computer Science rise to a leadership role in the intellectual pursuits of humankind and prove itself worthy of being a great science? It is hoped that it will play an instrumental role in propelling humankind to new levels of achievement. It is also hoped that this article will stimulate discussions in Computer Science departments as they design new curricula.

## Developing Mathematical Intuition in Children

I sometimes try to convey mathematical concepts and results to children without going through proofs, with varying degrees of success. I have wondered if even the deepest mathematical concepts can be conveyed to anyone in an intuitive manner. I am intrigued by the notion of human intuition. What makes us understand and see things intuitively? When is it possible?

Consider the result that there are more reals than natural numbers but there are as many integers as natural numbers or odd numbers or even numbers or prime numbers. All are infinite but infinity comes in different sizes. Integers are said to be countably infinite and reals uncountably infinite. One can start by using the analogy of matching girls with boys. Consider integers as girls and even numbers as boys; then each girl can indeed find a boy to marry and vice versa, and therefore there are as many integers as even numbers.
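For a child who already knows a little programming, the matching can be tried out directly. The sketch below is one possible pairing, my own choice for illustration: the girl (integer) n marries the boy (even number) 2n. The function names are hypothetical, and of course a program can only check a finite range.

```python
def boy_for(girl):
    """Match each integer (girl) with the even number (boy) twice her value."""
    return 2 * girl


def girl_for(boy):
    """Inverse matching: every even number (boy) finds exactly one integer."""
    if boy % 2 != 0:
        raise ValueError("boys are even numbers only")
    return boy // 2


# Check the pairing on a small range of girls, including negative integers.
girls = list(range(-5, 6))
boys = [boy_for(g) for g in girls]

for g, b in zip(girls, boys):
    print(g, "marries", b)
```

No boy is shared by two girls, and going from a girl to her boy and back returns the same girl, which is exactly what a one-to-one matching means.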

But how can we convey the idea behind the proof that there are more real numbers than integers without first going through the details of diagonalization? Assume here that girls are real numbers and boys are integers. Each real number is an infinite sequence of digits. For example, the girl named Pi is:

$\pi = 3.14159\,26535\,89793\,23846\,26433\,83279\ldots$

Therefore, one could say that any girl has an infinite description. Irrespective of how you marry them, one can create a new girl who is different from each married girl and is single. You build this new girl from the married girls as a composite, an infinite montage, but changing each piece: take the first digit of the girl married to the first boy, the second digit of the girl married to the second boy, and so on, and change every digit you take. The new girl then differs from each married girl in at least one digit. Therefore, she will be a totally new girl different from everyone else. One could continue such an intuitive and sketchy exposition by gradually becoming more precise and rigorous, slowly adding details, till the child sees the basic idea.
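The montage can also be played with on a computer, with the caveat that a program can only handle finitely many girls and finitely many digits, so this is an illustration of the idea rather than the proof. In the sketch below each girl is represented by the first few digits after her decimal point; the digit strings and the rule "write 5, unless the digit is already 5, then write 6" are my own choices.

```python
def diagonal_girl(married_girls):
    """Build digits that differ from the n-th girl in her n-th digit.

    Using only 5 and 6 as replacement digits avoids the 0.4999... = 0.5000...
    ambiguity of decimal expansions.
    """
    new_digits = []
    for n, digits in enumerate(married_girls):
        new_digits.append("6" if digits[n] == "5" else "5")
    return "".join(new_digits)


# Hypothetical list: the n-th entry holds the digits of the girl married to
# the n-th boy (only the digits after the decimal point are written down).
married = ["14159", "75828", "41421", "33333", "50000"]

single_girl = diagonal_girl(married)
print("0." + single_girl)

for n, digits in enumerate(married):
    print("differs from girl", n, "at digit", n, ":", single_girl[n], "vs", digits[n])
```

However long the list of married girls grows, the same recipe produces a girl who is missing from it.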

In my view, such an intuitive exposition should then be followed by the actual proof, and the combination of both will make something click in a young mind and enable it to grasp this fundamental concept of degrees of infinity.