What is Machine Learning?

As programmers, we often approach problems in a methodical, logic-based way. We try to determine what our desired outputs should be, and then create the proper rules that will transform our inputs into those outputs.

Machine learning flips the script. We want the program itself to learn the rules that describe our data the best, by finding patterns in what we know and applying those patterns to what we don't know.

These algorithms are able to learn. Their performance improves with each iteration as they uncover more hidden trends in the data.
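
To make "learning the rules" concrete, here is a minimal sketch in Python. It assumes toy data produced by a hidden rule, y = 2x + 1, and uses gradient descent, one common way to learn iteratively. The data, the function names, and the learning rate are all assumptions made for illustration.

```python
# Toy data generated by the hidden rule y = 2x + 1 (an assumption for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mean_squared_error(w, b):
    """Average squared gap between the predictions w*x + b and the known outputs."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(iterations, learning_rate=0.05):
    """Fit w and b by gradient descent, recording the error after every step."""
    w, b = 0.0, 0.0
    errors = [mean_squared_error(w, b)]
    for _ in range(iterations):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
        # Nudge both parameters in the direction that lowers the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        errors.append(mean_squared_error(w, b))
    return w, b, errors


w, b, errors = train(iterations=2000)
print(f"learned rule: y = {w:.2f}x + {b:.2f}")
print(f"error before training: {errors[0]:.2f}, after: {errors[-1]:.6f}")
```

Nobody typed the rule into the program. It starts from w = 0 and b = 0, and every pass over the data moves the two numbers closer to the values that describe the data best.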

Supervised Learning

Machine learning can be divided into the following categories:

-Supervised Learning

-Unsupervised Learning

Supervised Learning is where the data is labeled and the program learns to predict the output from the input data. For instance, a supervised learning algorithm for credit card fraud detection would take as input a set of recorded transactions, each labeled as fraudulent or legitimate. For each new transaction, the program would then predict whether it is fraudulent or not.
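
As a rough illustration of the fraud example, the sketch below learns a single cut-off on the transaction amount from labeled records. The transactions are made up, and a real fraud detector would use many more features and a stronger model; the names labeled_transactions, learn_threshold, and predict are hypothetical.

```python
# Hypothetical labeled transactions: (amount in dollars, is_fraud).
labeled_transactions = [
    (12.50, False), (40.00, False), (75.20, False), (130.00, False),
    (890.00, True), (1500.00, True), (60.00, False), (2200.00, True),
]


def accuracy(threshold, data):
    """Share of transactions labeled correctly by the rule 'amount > threshold'."""
    correct = sum((amount > threshold) == is_fraud for amount, is_fraud in data)
    return correct / len(data)


def learn_threshold(data):
    """Try each observed amount as a cut-off and keep the most accurate one."""
    candidates = sorted(amount for amount, _ in data)
    return max(candidates, key=lambda t: accuracy(t, data))


# The labels drive the learning: the cut-off is chosen to agree with them.
threshold = learn_threshold(labeled_transactions)


def predict(amount):
    """Predict whether a new, unlabeled transaction is fraudulent."""
    return amount > threshold


print(f"learned threshold: {threshold}")
print(f"is a $500 transaction flagged? {predict(500.00)}")
```

The labels are what make this supervised: without the True/False column there would be nothing to score a candidate cut-off against.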

Supervised learning problems can be further grouped into regression and classification problems.

Regression: In regression problems, we are trying to predict a continuous-valued output. Examples are:

What is the housing price in New York?
What is the value of cryptocurrencies?
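
A small sketch of a regression problem, assuming made-up apartment sizes and prices and an ordinary least-squares line as the model. The names fit_line and predict_price are hypothetical.

```python
# Hypothetical training data: apartment size in square feet and
# price in thousands of dollars. The numbers are invented for illustration.
sizes = [500, 750, 1000, 1250, 1500]
prices = [410, 540, 705, 845, 1000]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Predicted price (in thousands) for an apartment of the given size."""
    return slope * size + intercept


print(f"price = {slope:.3f} * size + {intercept:.1f}")
print(f"predicted price for 1100 sq ft: {predict_price(1100):.1f}")
```

The prediction can be any number along the fitted line, which is what makes this a continuous-valued output.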

Classification: In classification problems, we are trying to predict a discrete-valued output. Examples are:

Is this a picture of a human or a picture of an AI?
Is this email spam or not spam?
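
A small sketch of a classification problem, using the spam question with a K Nearest Neighbors vote. The two features (number of links and number of all-caps words) and the labeled emails are assumptions made up for this example.

```python
import math
from collections import Counter

# Hypothetical labeled emails: (number of links, number of ALL-CAPS words) and a label.
training_emails = [
    ((0, 0), "not spam"), ((1, 0), "not spam"), ((0, 1), "not spam"), ((1, 2), "not spam"),
    ((7, 9), "spam"), ((8, 6), "spam"), ((6, 8), "spam"), ((9, 10), "spam"),
]


def classify(features, k=3):
    """Label an email by majority vote among its k nearest labeled neighbors."""
    by_distance = sorted(training_emails, key=lambda item: math.dist(item[0], features))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(classify((7, 7)))
print(classify((1, 1)))
```

Whatever the input, the answer is always one of a fixed set of labels, here "spam" or "not spam". That is the discrete-valued output that separates classification from regression.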

Unsupervised Learning

Unsupervised Learning is a type of machine learning where the program learns the inherent structure of the data based on unlabeled examples.

Clustering is a common unsupervised machine learning approach that finds patterns and structures in unlabeled data by grouping the data points into clusters.

Some examples:

Social networks clustering topics in their news feed
Consumer sites clustering users for recommendations
Search engines grouping similar objects into one cluster
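
As an illustration of clustering, here is a minimal K-Means sketch for the "clustering users" case. The user data is invented, and the starting centroids are fixed by hand so the run is repeatable; real implementations usually pick them at random.

```python
import math

# Hypothetical unlabeled data, one point per user:
# (hours of video watched per week, articles read per week).
users = [(1, 9), (2, 8), (1, 7), (9, 1), (8, 2), (10, 2)]


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and moving the centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


centroids, clusters = k_means(users, centroids=[users[0], users[3]])
for centroid, cluster in zip(centroids, clusters):
    print(f"centroid {centroid}: {cluster}")
```

No labels are involved at any point. The two groups (heavy readers and heavy video watchers) come only from how close the points are to each other.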

The Process

When people think of Machine Learning, they often think of a program that is taking in data and spitting out predictions and insights.

The process of performing Machine Learning often requires many more steps before and after the predictive analytics.

We try to think of the Machine Learning process as:

Formulating a Question.
Finding and Understanding the Data.
Cleaning the Data and Feature Engineering.
Choosing a Model.
Tuning and Evaluating the Model.
Using the Model and Presenting Results.
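
The steps above can be sketched end to end on a tiny, made-up dataset. Everything here is an assumption for illustration: the question, the rows, the engineered feature, and the very simple cut-off model. A real project would use far more data and a proper model.

```python
# 1. Question (hypothetical): can we predict whether a customer will cancel?

# 2. Data: (weekly logins, support tickets, cancelled?) with some missing values.
raw_rows = [
    (1, 4, True), (9, 0, False), (None, 2, True), (2, 2, True),
    (8, 1, False), (7, None, False), (3, 5, True), (10, 1, False),
    (2, 3, True), (6, 0, False),
]

# 3. Cleaning and feature engineering: drop incomplete rows, then combine
#    the two raw columns into a single feature (tickets minus logins).
clean_rows = [row for row in raw_rows if None not in row]
dataset = [(tickets - logins, cancelled) for logins, tickets, cancelled in clean_rows]

# Hold out the last rows so the model is evaluated on data it never saw.
train_rows, test_rows = dataset[:5], dataset[5:]


# 4. Model choice: predict "cancelled" when the feature exceeds a cut-off.
def accuracy(cutoff, rows):
    return sum((feature > cutoff) == label for feature, label in rows) / len(rows)


# 5. Tuning: keep the candidate cut-off that scores best on the training rows,
#    then evaluate it on the held-out rows.
best_cutoff = max([-8, -4, 0, 4], key=lambda c: accuracy(c, train_rows))
test_accuracy = accuracy(best_cutoff, test_rows)

# 6. Using the model and presenting results.
new_customer_feature = 5 - 4  # 5 tickets, 4 logins
print(f"chosen cut-off: {best_cutoff}, held-out accuracy: {test_accuracy:.2f}")
print(f"new customer predicted to cancel: {new_customer_feature > best_cutoff}")
```

Only step 5 looks like what people usually picture as machine learning. The rest of the code is spent on preparing the data and reporting the result. With only three held-out rows the accuracy figure means very little, which is why the evaluation step needs care in practice.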

The Role

Data Scientists who specialize in Machine Learning fill a lot of roles on a Data Science team. As a Machine Learning Data Scientist, you might work with predictive modeling and artificial intelligence to solve problems at scale. There are also opportunities to be involved in Decision Science and to leverage machine learning to develop new technologies for disease identification, customer churn prediction, and recommendation systems.

The Usual Scope of the Role

Lead the development and improvement of algorithms to solve business problems.
Work with vast amounts of data and build data-driven products.
Apply knowledge of statistics, machine learning, programming, data modeling, simulation, and advanced mathematics to recognize patterns, identify opportunities, and make valuable discoveries leading to product development and improvement.
Prototype production-grade algorithms and models that improve the experience of customers and clients.
Research and develop algorithms to strengthen the system (anomaly detection, recommendation, forecasting, etc.)
Provide data visualization and presentation of findings to stakeholders.
Find new ways to drive business impact and solve complex problems.

What to master

Data Science Foundations
Supervised Machine Learning models
Linear Regression
Logistic Regression
K Nearest Neighbors
Decision Trees
Naive Bayes Classifier
Support Vector Machines
Random Forests
Unsupervised Machine Learning models
K-Means Clustering
Principal Component Analysis
Deep Learning with TensorFlow
Regression
Classification
Feature Engineering
Data Transformation
Feature Selection