Learn Data Science From Scratch: Understanding Supervised and Unsupervised Learning and Common Algorithms (Part 4)

A Comprehensive Guide to Machine Learning for Beginners, including Linear Regression, Logistic Regression, Decision Trees, Random Forests, Clustering, Principal Component Analysis (PCA), and More. Explore the Power of Machine Learning and Learn to Build Predictive Models and Extract Insights from Data.

Royston D. Mai, MS
3 min read · Mar 30, 2023

Machine learning is a subset of artificial intelligence in which computers learn patterns from data instead of following explicitly programmed rules. It has grown in importance in recent years because it enables the automated analysis of large datasets and the development of predictive models.

In this article, we will discuss the basics of machine learning, including supervised and unsupervised learning, and some of the most common algorithms used in the field.

Photo by Arseny Togulev on Unsplash

Supervised Learning

Supervised learning is a type of machine learning where the algorithm is trained on a labeled dataset, which pairs input features with the corresponding output labels. The goal is to learn a mapping function from input features to output labels, so that the trained model can make predictions on new, unseen data.

There are many different algorithms used in supervised learning, but some of the most common include the following:

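  1. Linear Regression: This algorithm models the relationship between a continuous dependent variable and one or more independent variables. It fits a line (or, with several independent variables, a hyperplane) to the data, which can then be used to make predictions on new data.
  2. Logistic Regression: Despite its name, this algorithm is used for classification, most commonly binary classification, where the output label is either 0 or 1. Like linear regression, it computes a weighted sum of the input features, but it passes that sum through a sigmoid function to produce a probability between 0 and 1.
  3. Decision Trees: This algorithm recursively splits the input space into regions using thresholds on the input features. It predicts the output label from the region an input falls into, typically the most common label (or the average value) among the training points in that region.
  4. Random Forests: This algorithm is an ensemble of decision trees. Each tree is trained on a random sample of the data and features, and the trees' predictions are combined by voting or averaging, which usually improves accuracy and reduces overfitting compared with a single tree.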
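
The first two algorithms in this list can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions rather than a production implementation: the toy data (`hours`, `scores`, `passed`), the function names, and the learning rate and epoch count are all made up for the example, and each model uses a single input feature.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares with one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, labels, lr=0.1, epochs=2000):
    """Logistic regression with one feature, trained by batch gradient
    descent on the log loss. lr and epochs are illustrative defaults."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, labels):
            err = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict_label(w, b, x):
    """Turn the predicted probability into a 0/1 label at threshold 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0

# Hypothetical labeled data: hours studied -> exam score (regression).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]  # lies exactly on y = 2x + 50
slope, intercept = fit_line(hours, scores)
predicted_score = slope * 6.0 + intercept  # prediction for a new input

# Hypothetical labeled data: hours studied -> passed (1) or failed (0).
study_hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
passed = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(study_hours, passed)
print(slope, intercept, predicted_score)
print(predict_label(w, b, 1.0), predict_label(w, b, 6.0))
```

In practice, a library such as scikit-learn provides tested implementations of these models, as well as of decision trees and random forests. The sketch only shows the underlying idea of fitting parameters to labeled examples and then predicting on new inputs.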

Photo by Clarisse Croset on Unsplash

Unsupervised Learning

Unsupervised learning is a type of machine learning where the algorithm is trained on an unlabeled dataset, that is, input features with no corresponding output labels. The goal is to identify patterns and structures in the data, which can then be used for tasks such as clustering, anomaly detection, or dimensionality reduction.

Some of the most common algorithms used in unsupervised learning include:

  1. Clustering: This family of algorithms groups similar data points together into clusters. K-means is a popular example: it repeatedly assigns each data point to the cluster with the nearest mean and then recomputes each mean from the points assigned to it.
  2. Principal Component Analysis (PCA): This algorithm is used for dimensionality reduction by projecting high-dimensional data onto a lower-dimensional space while preserving as much of the variance in the data as possible.
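
Both ideas can be sketched in plain Python on two-dimensional points. This is a minimal illustration under stated assumptions: the toy `points`, the function names, the fixed iteration count, and the choice to start k-means from the first `k` points are all made up for the example. The PCA part handles only the 2-D case, where the leading direction of the covariance matrix has a closed form.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points. Starts from the first k points,
    which is a simplification; real implementations randomize this."""
    centers = list(points[:k])
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: each mean becomes the average of its assigned points.
        centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters

def pca_first_component(points):
    """Project 2-D points onto the direction of greatest variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)
    # Angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    # One coordinate per point: the 2-D data reduced to 1-D.
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, projected

# Hypothetical unlabeled data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]

centers, clusters = kmeans(points, k=2)
direction, projected = pca_first_component(points)
print(centers)
print(direction)
```

On this toy data, k-means separates the three points near the origin from the three points near (8, 8), and the first principal component points along the diagonal, since that is the direction in which the points spread out the most.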

In conclusion, machine learning is now used across many industries. Supervised learning trains an algorithm on labeled data to make predictions on new data, while unsupervised learning identifies patterns and structures in unlabeled data. Common algorithms include linear regression, logistic regression, decision trees, and random forests on the supervised side, and clustering and PCA on the unsupervised side.

By learning the basics of machine learning, data scientists can build predictive models and extract insights from data, which can help to drive business decisions and improve performance. As the field continues to grow and evolve, these skills are likely to become even more important for businesses that want to stay ahead of the competition.
