Machine Learning

Written by Veda Konduru Sep 21, 2020

Introduction to Machine Learning.

At its core, Machine Learning attempts to mimic the way a human brain makes decisions. Humans excel at cognitive intelligence, while machines excel at speed. Machine Learning is about combining cognitive intelligence with speed.

Machine Learning is in use everywhere around you. Your email program identifies a “likely” spam email. Your bank is able to warn you about transactions that are “likely” suspicious. Insurance companies are able to determine which claims are “likely” fraudulent. In each of these situations, Machine Learning is at play. Also, notice the word “likely.” It means the machine can predict a likelihood but cannot be 100% certain. Legitimate email sometimes ends up in the spam folder. Your bank, not knowing your travel plans, may decide that the purchases you made overseas are fraudulent.
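To make the word “likely” concrete, here is a minimal Python sketch. It assumes the machine produces a score between 0 and 1 for each email and that anything at or above a chosen threshold goes to the spam folder. The scores, the 0.8 threshold and the `classify` function are hypothetical values chosen for illustration, not taken from any real email program.

```python
def classify(spam_score, threshold=0.8):
    """Turn a likelihood score in [0, 1] into a folder decision.

    The score is an estimate, not a certainty, so mistakes are possible.
    """
    return "spam" if spam_score >= threshold else "inbox"


# Hypothetical (subject, score, actually_spam) triples for illustration only.
scored_emails = [
    ("Claim your free prize now", 0.97, True),
    ("Your invoice for September", 0.12, False),
    ("Free tickets from your colleague", 0.85, False),  # legitimate, but looks like spam
]

# Legitimate emails that the threshold wrongly sends to the spam folder.
false_positives = [
    subject
    for subject, score, actually_spam in scored_emails
    if classify(score) == "spam" and not actually_spam
]
print(false_positives)
```

The third email is legitimate, yet its score is above the threshold, which is exactly how a real message can end up in the spam folder.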

So how does a Machine learn?

There are a few fundamental concepts we need to understand to appreciate how a machine learns and what its limitations are.

Take a spam detection program as an example. Humans can look at an email and determine whether it is spam or a legitimate email. But for the machine to be able to detect spam, it needs to be trained. At a high level, the Machine Learning steps are:

The machine needs to be “Trained” with many sample emails, each labeled as spam or not spam.
With this training, the Machine learns to look for specific words, phrases and sentences that are typically found in spam emails.
We then test the Machine using “Test Data”, which is a large, random collection of emails that includes some spam emails.
The Machine predicts whether each of the emails is spam or not, and we compare its predictions with the known answers to measure accuracy.
If the accuracy is not satisfactory, we train the Machine with more sample emails. If the accuracy is satisfactory, we test with more “Test Data.”
We continue this iteratively until the required accuracy is consistently achieved.
We then deploy the Machine to detect email that is “likely” spam and put it in the Spam folder.
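The training and testing steps above can be sketched in a few lines of Python. This is a minimal illustration that assumes a naive Bayes word-count model, which is one common technique for spam filtering and not necessarily what any particular email program uses. The function names (`train`, `spam_probability`, `accuracy`) and the tiny sample emails are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Split an email into lowercase words."""
    return text.lower().split()


def train(samples):
    """Count words per class from (text, is_spam) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    email_counts = Counter()
    for text, is_spam in samples:
        email_counts[is_spam] += 1
        word_counts[is_spam].update(tokenize(text))
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return {"word_counts": word_counts,
            "email_counts": email_counts,
            "vocabulary": vocabulary}


def spam_probability(model, text):
    """Estimate P(spam | text) with naive Bayes and add-one smoothing."""
    total_emails = sum(model["email_counts"].values())
    vocab_size = len(model["vocabulary"])
    log_scores = {}
    for label in (True, False):
        counts = model["word_counts"][label]
        words_in_class = sum(counts.values())
        # Smoothed class prior, then one smoothed likelihood term per word.
        score = math.log((model["email_counts"][label] + 1) / (total_emails + 2))
        for word in tokenize(text):
            score += math.log((counts[word] + 1) / (words_in_class + vocab_size))
        log_scores[label] = score
    # Normalize the two log scores into a probability for the spam class.
    highest = max(log_scores.values())
    spam = math.exp(log_scores[True] - highest)
    ham = math.exp(log_scores[False] - highest)
    return spam / (spam + ham)


def accuracy(model, test_data, threshold=0.5):
    """Fraction of test emails whose prediction matches the known label."""
    correct = sum(
        (spam_probability(model, text) >= threshold) == is_spam
        for text, is_spam in test_data
    )
    return correct / len(test_data)


# Hypothetical labeled samples: True means spam.
training_data = [
    ("win a free prize now", True),
    ("free money click now", True),
    ("claim your free prize", True),
    ("meeting agenda for monday", False),
    ("lunch on monday", False),
    ("project status report attached", False),
]
test_data = [
    ("free prize inside click now", True),
    ("monday project meeting", False),
]

model = train(training_data)
print("accuracy on test data:", accuracy(model, test_data))
```

Note that the test emails are kept separate from the training emails. Measuring accuracy on emails the machine has already seen would overstate how well it handles new mail.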

Looking at the steps above, we can see what makes the Machine more accurate in its predictions:

A large number of sample emails.
A wide variety of phrases, words and sentences that are typically found in spam email.
Multiple iterations of testing and training.
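The effect of these three factors can be seen in a small experiment. The sketch below uses a deliberately crude keyword model (a word counts as a spam word if it appears more often in spam samples than in legitimate ones) so that the train-and-test loop stays short. The batches of emails, the 0.9 accuracy target and the function names are all hypothetical.

```python
from collections import Counter


def train(samples):
    """Return the set of words seen more often in spam than in legitimate mail."""
    spam_words, ham_words = Counter(), Counter()
    for text, is_spam in samples:
        (spam_words if is_spam else ham_words).update(text.lower().split())
    return {word for word in spam_words if spam_words[word] > ham_words[word]}


def predict(spam_vocabulary, text):
    """Flag an email as spam if it contains any learned spam word."""
    return any(word in spam_vocabulary for word in text.lower().split())


def accuracy(spam_vocabulary, test_data):
    """Fraction of test emails classified correctly."""
    correct = sum(predict(spam_vocabulary, text) == is_spam
                  for text, is_spam in test_data)
    return correct / len(test_data)


# Hypothetical batches of labeled samples, added one iteration at a time.
batches = [
    [("win a free prize", True), ("team lunch on friday", False)],
    [("cheap loans approved today", True), ("quarterly report attached", False)],
    [("urgent verify your account password", True), ("see you at the meeting", False)],
]
test_data = [
    ("free prize waiting", True),
    ("loans approved instantly", True),
    ("verify password immediately", True),
    ("friday meeting report", False),
]

target_accuracy = 0.9
training_data = []
history = []
for batch in batches:
    training_data.extend(batch)           # more samples, more variety
    model = train(training_data)          # retrain on everything so far
    score = accuracy(model, test_data)    # test again
    history.append(score)
    if score >= target_accuracy:
        break

print(history)
```

Each batch introduces spam wording the machine had not seen before, so the measured accuracy rises with every iteration until the target is reached.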

When does a Machine need retraining?

The accuracy of spam detection is not guaranteed to stay high forever. Spammers continuously come up with new writing techniques to defeat spam detection programs. So the machine needs to “train and learn” continuously to keep ahead of spammers. We will see later in this series how the Machine retrains and learns.
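One simple way to decide when retraining is due is to keep measuring accuracy on recent mail and compare it with a minimum acceptable level. The sketch below assumes we know, for each recent email, whether the machine's prediction matched the user's own judgment. The `needs_retraining` function, the 0.95 threshold and the monthly figures are hypothetical.

```python
def needs_retraining(recent_results, min_accuracy=0.95):
    """Return True when recent accuracy has dropped below the acceptable level.

    recent_results is a list of booleans: True where the machine's prediction
    matched the label the user gave the email.
    """
    if not recent_results:
        return False  # no evidence yet, so no reason to retrain
    recent_accuracy = sum(recent_results) / len(recent_results)
    return recent_accuracy < min_accuracy


# Hypothetical outcomes: spammers changed their wording in the second month.
last_month = [True] * 98 + [False] * 2    # 98 of 100 predictions correct
this_month = [True] * 90 + [False] * 10   # 90 of 100 predictions correct

print(needs_retraining(last_month), needs_retraining(this_month))
```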