This is the first installment of a 3-part series on machine learning. The outline of the 3 installments is:

  1. Machine Learning Introduction
  2. Various implementations of machine learning
  3. Business impact of machine learning computers

The remaining installments will be published in the next few weeks. I will update this article with links to them.

Can you build a computer that can think for you, and maybe run your business? Computers have long been an integral part of every business enterprise, so much so that everything we do in business has computers’ fingerprints all over it. But the primary function of computers has been to aid humans in running the business, not to run it for us. A brave new generation of computers is on its way that promises to turn this paradigm on its head.

Imagine computers that think, that understand how your business is run, recognize what works and what does not, correct issues as they go along, and, most importantly, do all this with virtually no aid from you. Such computers barely exist today, but they will be business as usual someday. What does it take to build such computers? How much of this is hype and how much is reality? What classes of problems are these computers expected to solve? A branch of Artificial Intelligence (AI) called Machine Learning has been trying to answer these questions.

What is machine learning?

A typical computer program is a series of instructions executed on a set of data. The code reads the data, then manipulates and transforms it. Machine learning, on the other hand, is not about transforming data but about recognizing patterns in it and discovering insights into its structure, all without being told explicitly what to look for. In fact, it is even more than that: it is about a computer that builds up an understanding of the data and gets better as it sees more of it.

Machine learning is about creating algorithms that start with a blank slate and build up their knowledge as they process and analyze data. In general, the more data they see, the smarter they get. They do so in a variety of ways.
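To make the blank-slate idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular product works: the `FrequencyLearner` class and the weather-and-purchases observations are made up for this example. The learner knows nothing at first and simply remembers which outcome it has seen most often for each situation.

```python
from collections import Counter, defaultdict


class FrequencyLearner:
    """Hypothetical learner: remembers which outcome most often follows a situation."""

    def __init__(self):
        # Blank slate: no situations have been observed yet.
        self.counts = defaultdict(Counter)

    def observe(self, situation, outcome):
        """Update the learner's knowledge with one observation."""
        self.counts[situation][outcome] += 1

    def predict(self, situation):
        """Return the most frequently seen outcome, or None if nothing is known."""
        if situation not in self.counts:
            return None
        return self.counts[situation].most_common(1)[0][0]


learner = FrequencyLearner()
before = learner.predict("rainy")  # asked before any data has been seen

# Made-up observations: (weather, best-selling item that day).
history = [
    ("rainy", "umbrella"),
    ("sunny", "sunscreen"),
    ("rainy", "umbrella"),
    ("rainy", "coffee"),
]
for situation, outcome in history:
    learner.observe(situation, outcome)

after = learner.predict("rainy")  # asked again after processing the data
print(before, after)
```

Nothing in the code says what sells on a rainy day; whatever the learner answers comes entirely from the observations it was fed. Real learning algorithms are far more sophisticated, but the shape is the same: observe, update, predict.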

Take, for instance, a hypothetical chess-playing learning machine. This machine first learns the basic rules of chess by watching people play: the rook moves horizontally and vertically, the bishop moves diagonally, and so on. Then the computer learns the goal of the game: to win. Extending the example further, the computer learns how to achieve that goal by picking up strategies from the games it sees. While the computer is never coded to perform any specific list of chess plays, it is designed to learn them from the input data, by studying previous games and reassessing the games it plays against its opponents.

Hypothetically, suppose one copy of this software is given to an amateur chess player and another to, say, Garry Kasparov (former world chess champion), and each copy is allowed to play solely against its owner. After a certain period of time, if the two copies are made to go head to head, the latter will trounce the former. This is because the latter learned from a superior dataset, one accumulated by playing against Kasparov.

To Teach or not to Teach

Broadly speaking, there are two categories of learning algorithms: supervised and unsupervised. If the input data, also known as the training set, is clearly labeled, so that the computer knows what it is looking at, then it is supervised learning. However, if your data is a big clutter of bits, and the computer starts off without knowing what it is looking at but learns as it goes along, that is unsupervised learning.

For example, a face-recognition learning system processes several images of faces and non-faces from a training set that is clearly labeled: the faces in each image are marked and labeled. During the learning process, the computer isolates the features of faces that are not found on non-faces. The computer is then equipped with the “knowledge” of how to distinguish between faces and non-faces. The more images of faces and non-faces it processes, the more reliable that knowledge becomes.

On the other hand, consider the construction of a computer that classifies human emotions. The training set contains no labels for facial expressions or for how they correspond to human emotions. The system is fed a series of video snippets showing human faces expressing various emotions, along with the actions that follow. The computer is supposed to notice the subtle changes in facial expression, group them into categories, and associate those categories with the subtle differences in the actors’ subsequent actions.

Given that these two examples are very simple, there may be some overlap in their classification. But the essential idea is this: in supervised learning, the computer starts off knowing what it needs to do and goes on to become really good at doing it. In unsupervised learning, on the other hand, the computer typically has no idea what it is looking at or what it is supposed to find, and goes on to discover hidden patterns and deep structures in the data.

Although the task at hand dictates which kind of learning is more suitable, in the context of big data, unsupervised learning algorithms are expected to be heavily used in the near future. Unsupervised learning is well suited to data that contains deep hierarchical and/or causal relationships between observations and/or latent variables.

Learning algorithms may not fall cleanly on one side of this dichotomy. Many combine the two: within the same system, some aspects of learning may be supervised while others are unsupervised.

Business Intelligence vs. Learning Machines

There are some striking similarities between the data mining components of Business Intelligence (BI) suites that are used for pattern recognition and actual machine learning implementations. While BI data mining is a set of tools and techniques used by humans to aid in pattern recognition and eventually make better decisions, machine learning is performed primarily to the advantage of the machines themselves, so that they perform better by reorganizing themselves. There is a significant overlap in the techniques used in these two domains of data analysis.

So how does a computer learn?

Another way of classifying learning machines is by their expected output. Let us look at the kinds of output we expect from these computers.

Regression Analysis: The expected output of this computer is a hidden relationship between two or more variables. For example, is there any relationship between the weather outside and my sales data?
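The weather-and-sales question can be sketched with the simplest form of regression, a straight line fitted by ordinary least squares. This is only an illustration under stated assumptions: the temperature and sales figures are invented, and `fit_line` is a hypothetical helper written for this example rather than part of any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: y is modeled as intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, relative to how much x varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: outside temperature (Celsius) and units sold that day.
temperature_c = [10, 15, 20, 25, 30]
daily_sales = [31, 39, 50, 61, 69]

slope, intercept = fit_line(temperature_c, daily_sales)
forecast = intercept + slope * 22  # estimated sales on a 22-degree day

print(f"sales ~ {intercept:.2f} + {slope:.2f} * temperature")
print(f"forecast at 22 degrees: {forecast:.2f}")
```

A positive slope here suggests that sales rise with temperature in this made-up dataset. Real sales data is rarely this tidy, and a fitted line shows association, not cause.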

Classification: The expected output of this learning system is a large chunk of data sorted into preset categories. For example, what criteria can I use to categorize my employees as top performers, team players, and laggards?
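One simple way to sketch classification is a nearest-centroid classifier: average the labeled examples in each preset category, then assign a new item to the category whose average it sits closest to. The two scores per employee (individual output and collaboration, each on a 0 to 10 scale) and all the numbers below are assumptions made up for illustration, as are the helper names `train_centroids` and `classify`.

```python
from math import dist


def train_centroids(examples):
    """Average the feature vectors of each labeled category."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in total)
        for label, total in sums.items()
    }


def classify(centroids, features):
    """Pick the category whose centroid is nearest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Invented labeled training set: ((output score, collaboration score), category).
training_set = [
    ((9, 5), "top performer"),
    ((8, 6), "top performer"),
    ((5, 9), "team player"),
    ((6, 8), "team player"),
    ((2, 3), "laggard"),
    ((3, 2), "laggard"),
]

centroids = train_centroids(training_set)
print(classify(centroids, (8, 5)))
print(classify(centroids, (5, 8)))
print(classify(centroids, (2, 2)))
```

Because the categories and the labeled examples are supplied up front, this is supervised learning in the sense described earlier.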

Cluster Analysis: Similar to classification, cluster analysis takes a series of similar objects and groups them. The difference between this and the previous method is that clustering has no preset categories. Objects are scattered over the data space, and the computer identifies clusters among them. For example, identifying customer clusters for targeted marketing.
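A common way to find such clusters is the k-means algorithm, sketched below in a deliberately small form. The customer figures are invented, the number of clusters is chosen by hand, and the centroids are seeded with the first k points so the run is repeatable; practical implementations usually pick starting points more carefully.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Minimal k-means: alternate between assigning points and moving centroids."""
    # Assumption for this sketch: seed the centroids with the first k points.
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Invented, unlabeled customers: (visits per month, average basket size).
customers = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]

centroids, clusters = k_means(customers, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No labels appear anywhere in the input. The two groups (occasional small shoppers and frequent big spenders) emerge from the data alone, which is what makes this an unsupervised method.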

Computers can learn to run your business for you

While the science of machine learning has been flourishing in scientific and mathematical circles, the business community has been slow to adopt it. With the exception of financial institutions and some sales and marketing campaigns, thinking and learning machines have not made much headway. With the ubiquity and popularity of big data infrastructure such as Hadoop, it is easy to see that the near future holds exciting trends for machine learning in business.

Big data enables businesses to adopt machine learning technologies. The potential of machine learning, both as a field of study and in its business applications, is enormous. There are many problems we know about but do not know how to solve. There is an even bigger list of problems that we do not even know exist, let alone know how to solve. If we ever hope to discover these problems and solve them effectively, learning machines are our friends.