What is Machine Learning?
As programmers, we often approach problems in a methodical, logic-based way. We try to determine what our desired outputs should be, and then create the proper rules that will transform our inputs into those outputs.
Machine learning flips the script. We want the program itself to learn the rules that describe our data the best, by finding patterns in what we know and applying those patterns to what we don't know.
These algorithms are able to learn. Their performance improves with each iteration as they uncover more hidden trends in the data.
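To make "improves with each iteration" concrete, here is a minimal sketch of a program that learns a single number w so that w * x matches the data. The toy data, the learning rate, and the function names are all assumptions chosen for illustration; this is one simple learning procedure (gradient descent), not the only one.

```python
# Toy data generated by the hidden rule y = 3 * x (values are made up).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

def mean_squared_error(w):
    """Average squared gap between the predictions w * x and the true y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(steps=10, learning_rate=0.05):
    """Run gradient descent on w and record the error after every step."""
    w = 0.0  # start with no knowledge of the rule
    losses = [mean_squared_error(w)]
    for _ in range(steps):
        # Derivative of the mean squared error with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * gradient
        losses.append(mean_squared_error(w))
    return w, losses

w, losses = train()
print(f"learned w = {w:.4f}")
print(f"error at start = {losses[0]:.4f}, error at end = {losses[-1]:.10f}")
```

The program is never told that the rule is "multiply by 3". It only sees examples, and the recorded error shrinks at every step as w moves toward that value.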
Supervised Learning
Machine learning can be divided into the following categories:
- Supervised Learning
- Unsupervised Learning
Supervised Learning is where the data is labeled and the program learns to predict the output from the input data. For instance, a supervised learning algorithm for credit card fraud detection would take as input a set of recorded transactions, each labeled as fraudulent or legitimate. For each new transaction, the program would predict whether it is fraudulent or not.
Supervised learning problems can be further grouped into regression and classification problems.
Regression: In regression problems, we are trying to predict a continuous-valued output. Examples are:
- What is the housing price in New York?
- What is the value of cryptocurrencies?
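As a rough illustration of regression, the sketch below fits a straight line to a handful of labeled examples using ordinary least squares and then predicts a price for an unseen input. The sizes and prices are hypothetical numbers invented for this example, and `fit_line` is an illustrative helper, not a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled data: apartment size (square meters) -> price (thousands of dollars).
sizes = [30.0, 50.0, 70.0, 90.0]
prices = [300.0, 480.0, 710.0, 890.0]

slope, intercept = fit_line(sizes, prices)

# The output is a continuous value, which is what makes this a regression problem.
predicted_price = slope * 60.0 + intercept
print(f"predicted price for 60 square meters: {predicted_price:.1f}")
```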
Classification: In classification problems, we are trying to predict a discrete-valued output. Examples are:
- Is this a picture of a human or a picture of an AI?
- Is this email spam or not spam?
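A classification problem such as the spam question can be sketched with a k-nearest-neighbors classifier, which labels a new example by majority vote among the most similar labeled examples. The two features used here (number of links, number of exclamation marks) and all data points are assumptions made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical features per email: (number of links, number of exclamation marks).
emails = [(0, 0), (1, 0), (0, 1), (7, 5), (8, 6), (6, 7)]
labels = ["not spam", "not spam", "not spam", "spam", "spam", "spam"]

# The output is one of a fixed set of labels, which makes this classification.
print(knn_predict(emails, labels, (7, 6)))
print(knn_predict(emails, labels, (1, 1)))
```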
Unsupervised Learning
Unsupervised Learning is a type of machine learning where the program learns the inherent structure of the data based on unlabeled examples.
Clustering is a common unsupervised machine learning approach that finds patterns and structures in unlabeled data by grouping similar data points into clusters.
Some examples:
- Social networks clustering topics in their news feed
- Consumer sites clustering users for recommendations
- Search engines grouping similar objects into one cluster
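One way to picture clustering is the K-Means procedure sketched below: it repeatedly assigns each point to its nearest centroid and then moves each centroid to the mean of its points. The user data, the choice of two clusters, and the starting centroids are all assumptions for illustration; real implementations usually pick starting centroids more carefully.

```python
import math

def k_means(points, initial_centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-centering."""
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        assignments = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments

# Hypothetical unlabeled users: (hours browsing per week, purchases per month).
users = [(1, 1), (2, 1), (1, 2), (9, 8), (8, 9), (9, 9)]

centroids, assignments = k_means(users, initial_centroids=[users[0], users[3]])
print(assignments)  # cluster index for each user
print(centroids)    # center of each discovered group
```

No labels are supplied anywhere. The two groups emerge only from how close the points are to each other.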
The Process
When people think of Machine Learning, they often think of a program that is taking in data and spitting out predictions and insights.
In practice, performing Machine Learning often requires many more steps before and after the predictive analytics.
We try to think of the Machine Learning process as:
- Formulating a Question.
- Finding and Understanding the Data.
- Cleaning the Data and Feature Engineering.
- Choosing a Model.
- Tuning and Evaluating the Model.
- Using the Model and Presenting Results.
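The steps above can be sketched end to end on a toy problem. Everything here is hypothetical: the question ("will a student pass, given hours studied?"), the records, and the deliberately simple threshold model are stand-ins chosen to keep the example short.

```python
# Finding the data: hypothetical records of (hours studied, passed exam?).
raw_records = [(1.0, 0), (2.0, 0), (None, 1), (3.0, 0), (6.0, 1),
               (7.0, 1), (8.0, 1), (2.5, 0), (6.5, 1)]

# Cleaning the data: drop records with a missing feature value.
clean = [(hours, passed) for hours, passed in raw_records if hours is not None]

# Hold out the last two records so the model is evaluated on unseen data.
train_set, test_set = clean[:-2], clean[-2:]

def fit_threshold(records):
    """Choosing a model: a single cut-off on hours with the fewest training mistakes."""
    candidates = sorted(hours for hours, _ in records)

    def mistakes(threshold):
        return sum((hours >= threshold) != bool(passed) for hours, passed in records)

    return min(candidates, key=mistakes)

def accuracy(threshold, records):
    """Evaluating the model: fraction of records predicted correctly."""
    correct = sum((hours >= threshold) == bool(passed) for hours, passed in records)
    return correct / len(records)

threshold = fit_threshold(train_set)
test_accuracy = accuracy(threshold, test_set)

# Using the model and presenting results.
print(f"learned threshold: {threshold} hours")
print(f"accuracy on held-out records: {test_accuracy:.2f}")
```

Most of the code deals with preparing and evaluating rather than predicting, which mirrors how the work is usually distributed in practice.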
The Role
Data Scientists who specialize in Machine Learning fill a lot of roles on a Data Science team. As a Machine Learning Data Scientist, you might work with predictive modeling and artificial intelligence to solve problems at scale. There are also opportunities to get involved in Decision Science and to leverage machine learning to develop new technologies in areas such as disease identification, customer churn prediction, artificial intelligence, and recommendation systems.
The Usual Scope of the Role
- Lead the development and improvement of algorithms to solve business problems.
- Work with vast amounts of data and build data-driven products.
- Apply knowledge of statistics, machine learning, programming, data modeling, simulation, and advanced mathematics to recognize patterns, identify opportunities, and make valuable discoveries leading to product development and improvement.
- Prototype production-grade algorithms and models that improve the experience of customers and clients.
- Research and develop algorithms to strengthen the system (anomaly detection, recommendation, forecasting, etc.)
- Provide data visualization and presentation of findings to stakeholders.
- Find new ways to drive business impact and solve complex problems.
What to master
- Data Science Foundations
- Supervised Machine Learning models
- Linear Regression
- Logistic Regression
- K Nearest Neighbors
- Decision Trees
- Naive Bayes Classifier
- Support Vector Machines
- Random Forests
- Unsupervised Machine Learning models
- K-Means Clustering
- Principal Component Analysis
- Deep Learning with TensorFlow
- Regression
- Classification
- Feature Engineering
- Data Transformation
- Feature Selection