Practical Machine Learning with Python and Keras

What is machine learning, and why do we care?

Machine learning is a field of artificial intelligence that uses statistical techniques to give computer systems the ability to “learn” (i.e., progressively improve performance on a specific task) from data, without being explicitly programmed. Think of how efficiently (or not) Gmail detects spam emails, or how good text-to-speech has become with the rise of Siri, Alexa, and Google Home.

Some of the tasks that can be solved by implementing Machine Learning include:

  • Anomaly and fraud detection: Detect unusual patterns in credit card and bank transactions.
  • Prediction: Predict future prices of stocks, exchange rates, and now cryptocurrencies.
  • Image recognition: Identify objects and faces in images.

Machine Learning is an enormous field, and today we’ll be working to analyze just a small subset of it.

Supervised Machine Learning

Supervised learning is one of Machine Learning’s subfields. The idea behind Supervised Learning is that you first teach a system to understand your past data by providing many examples of a specific problem together with the desired outputs. Then, once the system is “trained”, you can show it new inputs in order to predict the outputs.

How would you build an email spam detector? One way to do it is through intuition – manually defining rules that make sense, such as “contains the word ‘money’” or “contains the phrase ‘Western Union’”. While manually built rule-based systems can sometimes work, at other times it becomes hard to create or identify patterns and rules based on human intuition alone. By using Supervised Learning, we can train systems to learn the underlying rules and patterns automatically from a large amount of past spam data. Once our spam detector is trained, we can feed it a new email so that it can predict how likely that email is to be spam.
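To make the contrast concrete, here is a minimal sketch of the manual, rule-based approach. The keyword list and the helper name are made up for illustration – the point is that every rule here is human intuition, and every new spam pattern would require yet another hand-written rule:

```python
# A hand-written, rule-based spam detector: every rule comes from
# human intuition, and new spam patterns require new manual rules.
SPAM_KEYWORDS = {"money", "western union", "winner", "free"}  # hypothetical rules

def looks_like_spam(email_text):
    """Flag the email if it contains any of our hand-picked keywords."""
    text = email_text.lower()
    return any(keyword in text for keyword in SPAM_KEYWORDS)

print(looks_like_spam("Claim your FREE money now!"))  # caught by our rules
print(looks_like_spam("Lunch tomorrow at noon?"))     # passes our rules
```

A supervised learning system replaces the hand-maintained `SPAM_KEYWORDS` set with patterns learned automatically from labeled examples of past spam.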

Earlier I mentioned that you can use Supervised Learning to predict an output. There are two primary kinds of supervised learning problems: regression and classification.

  • In regression problems, we try to predict a continuous output. For example, predicting the price (real value) of a house when given its size.
  • In classification problems, we try to predict a discrete number of categorical labels. For example, predicting if an email is spam or not given the number of words within it.
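A toy sketch of the difference between the two, using numpy. The house sizes and prices, and the word-count threshold, are invented purely for illustration:

```python
import numpy as np

# Regression: predict a continuous output (price) from house size.
sizes = np.array([50.0, 100.0, 150.0])    # square meters (made-up data)
prices = np.array([100.0, 200.0, 300.0])  # thousands of dollars (made-up data)
slope, intercept = np.polyfit(sizes, prices, deg=1)  # fit a least-squares line
predicted_price = slope * 120.0 + intercept          # any real value is possible

# Classification: predict a discrete label (spam / not spam) from a feature.
def classify_email(word_count, threshold=500):
    """A trivially simple classifier: only two possible outputs."""
    return "spam" if word_count > threshold else "not spam"

print(round(predicted_price))  # a continuous prediction, here ~240
print(classify_email(800))     # a discrete label: 'spam'
```

The regression output can be any number on a continuous scale, while the classifier can only ever return one of a fixed set of labels.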

You can’t talk about Supervised Machine Learning without talking about supervised learning models – it’s like talking about programming without mentioning programming languages or data structures. In fact, the learning models are the structures that are “trained,” and their weights or structure change internally as they mold and understand what we are trying to predict. There are plenty of supervised learning models; some of the ones I have personally used are:

  • Random Forest
  • Naive Bayes
  • Logistic Regression
  • K Nearest Neighbors

Today we’ll be using Artificial Neural Networks (ANNs) as our model of choice.

Understanding Artificial Neural Networks

ANNs are named this way because their internal structure is meant to mimic the human brain. A human brain consists of neurons and synapses that connect these neurons with each other, and when these neurons are stimulated, they “activate” other neurons in our brain through electricity.

In the world of ANNs, each neuron is “activated” by first computing the weighted sum of its incoming inputs (other neurons from the previous layer), and then running the result through an activation function. When a neuron is activated, it will, in turn, activate other neurons that will perform similar computations, causing a chain reaction between all the neurons of all the layers.

It’s worth mentioning that, while ANNs are inspired by biological neurons, they are in no way comparable.

What the diagram above describes is the entire activation process that every neuron goes through. Let’s look at it together, from left to right:

  • All the inputs (numerical values) from the incoming neurons are read. The incoming inputs are identified as x1..xn.
  • Each input is multiplied by the weight associated with that connection. The weights associated with the connections here are denoted as W1j..Wnj.
  • All the weighted inputs are summed together and passed into the activation function. The activation function reads the single summed weighted input and transforms it into a new numerical value.
  • Finally, the numerical value that was returned by the activation function will then be the input of another neuron in another layer.
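The steps above can be sketched in a few lines of Python. The input values, the weights, and the choice of sigmoid as the activation function are all illustrative assumptions:

```python
import math

def sigmoid(z):
    """A common activation function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def activate_neuron(inputs, weights):
    """Weighted sum of the incoming inputs, passed through the activation."""
    # x1*W1j + x2*W2j + ... + xn*Wnj
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(weighted_sum)

# x1..x3 from the previous layer, W1j..W3j on the connections (made-up values)
output = activate_neuron([0.5, -1.0, 2.0], [0.8, 0.2, 0.1])
# 'output' would in turn become an input to neurons in the next layer
```

The weighted sum here is 0.5*0.8 + (-1.0)*0.2 + 2.0*0.1 = 0.4, and the sigmoid maps it to a value between 0 and 1 that the next layer can consume.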

Neural Network layers

Neurons inside the ANN are arranged into layers. Layers are a way to give structure to the Neural Network; each layer contains one or more neurons. A Neural Network will usually have three or more layers. There are two special layers that are always defined: the input layer and the output layer.
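As a sketch, a forward pass through a small network – two input neurons, a hidden layer of three, and one output neuron, with random weights – is just the per-neuron activation repeated layer by layer. The layer sizes and weights here are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    """Element-wise activation function."""
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per pair of consecutive layers (randomly initialized).
w_hidden = rng.normal(size=(2, 3))  # input layer (2) -> hidden layer (3)
w_output = rng.normal(size=(3, 1))  # hidden layer (3) -> output layer (1)

def forward(x):
    """Propagate the inputs through every layer of the network."""
    hidden = sigmoid(x @ w_hidden)    # each hidden neuron: weighted sum + activation
    return sigmoid(hidden @ w_output)  # same computation for the output layer

prediction = forward(np.array([0.5, -0.3]))
print(prediction.shape)  # one output neuron -> shape (1,)
```

Training is then the process of adjusting `w_hidden` and `w_output` so the output layer’s predictions match the desired outputs.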
