Choosing the right Machine Learning algorithms

“Google’s self-driving cars and robots get a lot of press, but the company’s real future is in machine learning, the technology that enables computers to get smarter and more personal”

                                – Eric Schmidt, Google Chairman  –

Machine Learning is a form of Artificial Intelligence (AI) which allows computers to learn by way of observation and experience, rather than rigid pre-programming.

Machine Learning systems have been around since the 1950s, but three factors now set the field apart from those early days:

  1. An enormous increase in available data
  2. More powerful computer hardware
  3. Improved algorithms

Together, these factors explain why Machine Learning matters far more today than it did in the 1950s.

Selecting the right algorithm is a key part of any Machine Learning project, and because there are dozens to choose from, understanding their strengths and weaknesses in various business applications is essential. Machine Learning algorithms can predict patterns based on previous experiences. These algorithms find predictable, repeatable patterns that can be applied to e-commerce, data management, and new technologies such as driverless cars.

There are three types of Machine Learning algorithms:

  1. SUPERVISED LEARNING: This is a form of function approximation. We train an algorithm on examples whose correct outputs are known and, at the end of the process, pick the function that best describes the data: the one that, for a given input X, makes the best estimate of the output y (X -> y).

Most of the time, we are not able to figure out the true function that always makes the correct predictions. The main types of Supervised learning problems include regression and classification problems.
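As a minimal sketch of function approximation, assume we have a handful of made-up (X, y) pairs and restrict ourselves to straight lines. The data and the names `fit_line` and `predict` are illustrative assumptions, not part of any particular library. The code picks the line that minimizes the squared error on the training examples; it approximates the underlying relationship without recovering it exactly.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical training data: y is roughly 2x + 1 plus some noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

slope, intercept = fit_line(xs, ys)


def predict(x):
    """Estimate y for a new input x using the fitted line."""
    return slope * x + intercept


print(f"y is approximately {slope:.2f} * x + {intercept:.2f}")
```

The fitted line does not pass through every training point, which illustrates why the learned function is only the best available estimate.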

Regression problems are those where the output value is a continuous number. For example, the result might be a number between 0 and 1 that represents the probability that a person will repay a debt.

Classification problems, on the other hand, are those where the output value belongs to a discrete and finite set. Classifying tweets as positive, negative, or neutral is an example of a classification problem.
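The two problem types can be seen side by side in a small sketch. It assumes a single made-up feature (a debt-to-income ratio) and made-up labels (1 = repaid, 0 = defaulted), and it trains a one-variable logistic model with plain gradient descent. The function names and hyperparameters are illustrative choices. The model's raw output is a continuous probability, and thresholding that probability turns it into a discrete class.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.5, epochs=2000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data: low debt-to-income ratios repaid (1), high ones did not (0).
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = train_logistic(xs, ys)


def repay_probability(ratio):
    """Continuous output: estimated probability of repayment."""
    return sigmoid(w * ratio + b)


def classify(ratio, threshold=0.5):
    """Discrete output: 1 (will repay) or 0 (will not)."""
    return 1 if repay_probability(ratio) >= threshold else 0
```

Calling `repay_probability(0.2)` returns a number between 0 and 1, while `classify(0.2)` returns one of two labels.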

Supervised Learning is the most popular category of Machine Learning algorithms. The disadvantage of using this approach is that for every training example, we have to provide the correct output, and in many cases, this is quite expensive.

For example, in the case of sentiment analysis, if we need 10,000 training examples (tweets), we would have to tag each tweet with the correct sentiment (positive, negative, or neutral). That would require a group of people to read and tag each tweet, which is a time-consuming and tedious task. Gathering high-quality tagged training data is a very common bottleneck for Machine Learning projects.

  2. UNSUPERVISED LEARNING: Here, we do not have any target or outcome variable to predict or estimate. This approach is commonly used to cluster a population into groups, for example to segment customers for a specific intervention.

The main types of Unsupervised learning algorithms include Clustering algorithms and Association Rule Learning algorithms. Here, the program is given a large amount of unlabeled data, and it must find patterns and relationships in it. For example, given a news article, we may want to find similar articles to recommend.
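The article-recommendation example can be sketched without any labels. The snippet below assumes a simple bag-of-words representation and cosine similarity, which is one common choice among many. The headlines and function names are made up for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Represent a text as word counts (a bag of words)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, articles):
    """Return the article whose word counts are closest to the query's."""
    query_vector = vectorize(query)
    return max(articles, key=lambda art: cosine_similarity(query_vector, vectorize(art)))


# Hypothetical headlines standing in for full articles.
articles = [
    "central bank holds interest rates steady amid inflation fears",
    "local team wins championship after dramatic final match",
    "new smartphone launch features faster chip and better camera",
]

recommended = most_similar("bank raises interest rates as inflation climbs", articles)
print(recommended)
```

No article was ever tagged with a topic; the recommendation comes purely from the overlap in vocabulary.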

  3. REINFORCEMENT LEARNING: With this approach, the machine is trained to make specific decisions.

It works this way: the machine is exposed to an environment where it trains itself continually through trial and error. The machine learns from past experience and tries to capture the best possible knowledge to make accurate decisions. Applications of reinforcement learning include computer-played board games (Chess, Go), robotic hands, and self-driving cars.
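The trial-and-error loop can be illustrated with tabular Q-learning, one well-known reinforcement learning algorithm, on a toy environment. Everything here is an assumption made for illustration: a five-cell corridor, a reward of 1 for reaching the rightmost cell, and the chosen learning rate, discount, and exploration rate.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def step(state, action):
    """Apply an action; return (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore: try a random action
            else:
                best = max(q[state])  # exploit: pick a best-known action
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = train()
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

The agent is never told that moving right is correct. It discovers this from the rewards it collects over repeated episodes.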

A common principle is that Machine Learning algorithms try to make generalizations. That is, they try to explain something with the simplest theory (Occam’s razor principle). Every Machine Learning algorithm tries to create the simplest hypothesis (the one that makes the fewest assumptions) that explains most of the training examples.

The goal of Machine learning is never to make perfect guesses because Machine learning deals in domains where there is no such thing. The goal is to make guesses that are good enough to be useful.

There are many interesting algorithms available to work with, but I will list just a few of them:

  • Random forest: Decision trees use directed graphs to model decision making. Each node on the graph represents a question about the data (“Is income greater than $18,000?”), and the branches stemming from each node represent the possible answers to that question. Combining hundreds or even thousands of these decision trees and aggregating their predictions is an ensemble method called a Random forest.
  • Neural networks: The goal of artificial neural network algorithms is to mimic the way the human brain organizes and understands information in order to arrive at predictions. In an artificial neural network, information passes through an input layer, one or more hidden layers, and an output layer. The input layer holds the raw features and the output layer holds the predictions. The hidden layers in between consist of many highly interconnected neurons capable of complex meta-feature engineering. As the network learns from the data, the connections between these neurons are fine-tuned to reduce prediction error.
  • Logistic regression: Logistic regression, which is borrowed from the field of classical statistics, is one of the simpler Machine Learning algorithms. This technique is commonly used for binary classification problems, meaning those in which there are two possible outcomes that are influenced by one or more explanatory variables. The algorithm estimates the probability of an outcome given a set of observed variables.
  • Kernel methods: Kernel methods are used for pattern analysis, which involves organizing raw data into rankings, clusters, or classifications. The most popular application of kernels is the support vector machine, which builds a model that classifies new data as belonging to one category or another based on a set of training examples.
  • K-Means clustering: Clustering is a type of unsupervised learning, which is used when working with data that does not have defined categories or groups (unlabeled data). The goal of K-Means clustering is to find distinct groups in the data based on inherent similarities between them rather than predetermined labels. K represents the total number of unique groups the algorithm will create. Each example is assigned to one group or another based on its similarity to other examples across a set of characteristics called features. K-Means clustering is useful for business applications like customer segmentation, inventory categorization, and anomaly detection.
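To make one of the algorithms above concrete, here is a minimal K-Means sketch in plain Python. It assumes two made-up customer features (say, spend and visits on some arbitrary scale) and starts the centroids at the first K points, which is a simplification; practical implementations usually use a randomized or smarter initialization. The function names are illustrative.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iterations=100):
    """Group points into k clusters; return (labels, centroids)."""
    centroids = [tuple(p) for p in points[:k]]  # simplistic deterministic start
    labels = []
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical customers described by two features.
customers = [(1.0, 1.0), (8.0, 9.0), (1.5, 2.0), (9.0, 8.0), (1.0, 0.5), (8.5, 9.5)]

labels, centroids = k_means(customers, k=2)
print(labels)
print(centroids)
```

Each customer ends up with a group number that was never supplied as a label, which is the kind of output a segmentation project would build on.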


Ultimately, the best Machine Learning algorithm for any given project depends on the data available, how the results will be used, and the data scientist’s expertise in the domain. Understanding how the algorithms differ is a key step toward ensuring that every predictive model your data scientists build and deploy delivers valuable results.
