Machine Learning · December 10, 2023

Understanding the 3 Pillars of Machine Learning: Supervised, Unsupervised, and Reinforcement Learning

Mastering the Basics: Dependent and Independent Variables

Before diving deeper, let’s clarify the crucial concepts of dependent and independent variables in machine learning. This understanding is key to unlocking the differences between the three main learning types.

A Look at the Dependent Variable: The Key Differentiator

The presence or absence of the dependent variable is what truly sets supervised, unsupervised, and reinforcement learning apart. Let’s explore each type:

1. Unsupervised Learning: Independent Variables Only

This approach focuses on analyzing data without any pre-defined labels. The model seeks patterns and relationships between independent variables, leading to various applications like:

  • Data exploration and dimensionality reduction: Uncovering hidden patterns, clusters, and anomalies, and simplifying data by removing redundant or irrelevant features.
  • Clustering and market segmentation: Grouping similar data points, customers, or products together for analysis.
  • Feature engineering: Extracting relevant features from complex datasets for further analysis.
  • Identifying brand attributes: Uncovering the unique characteristics that differentiate a brand from its competitors.
  • Recommendation systems: Suggesting items or content that users might be interested in based on their past behavior.
  • Techniques Examples:
    • Clustering:
      • K-means clustering: Partitions data points into a predefined number of clusters based on their similarities.
      • Hierarchical clustering: Builds a hierarchy of clusters by successively merging or splitting groups.
      • Density-based spatial clustering of applications with noise (DBSCAN): Identifies clusters based on density of data points, robust to outliers.
    • Dimensionality reduction:
      • Principal component analysis (PCA): Projects data onto a lower-dimensional space while preserving most of the variance.
      • Locally linear embedding (LLE): Preserves local relationships between data points in a lower-dimensional space.
      • Autoencoders: Learn compressed representations of data that can be used for reconstruction or other tasks.
    • Anomaly detection:
      • One-class Support Vector Machines (OCSVM): Creates a boundary around the normal data, identifying points that fall outside as anomalies.
      • Isolation Forest: Isolates anomalies by randomly splitting the data; points that require fewer splits are likely outliers.
      • Local Outlier Factor (LOF): Identifies anomalies based on their local density compared to their neighbors.
    • Other techniques:
      • Independent component analysis (ICA): Extracts statistically independent signals from mixed data.
      • Topic modeling: Identifies latent topics or themes within a collection of documents.
      • Markov chain Monte Carlo (MCMC): A family of sampling methods that draw samples from complex probability distributions, used to explore the structure of the data.
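Two of the techniques above, K-means clustering and PCA, can be sketched in a few lines. This is a minimal illustration using scikit-learn on synthetic data (the library and the two "customer segments" are assumptions for the example, not from the original text):

```python
# Minimal sketch of two unsupervised techniques: K-means clustering and PCA.
# Assumes scikit-learn and NumPy are installed; the data here is synthetic.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
# Two synthetic "customer segments" in 4-dimensional feature space
group_a = rng.normal(loc=0.0, scale=0.5, size=(50, 4))
group_b = rng.normal(loc=5.0, scale=0.5, size=(50, 4))
X = np.vstack([group_a, group_b])

# Clustering: partition the points into 2 clusters by similarity alone —
# note there are no labels (no dependent variable) anywhere in this script
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:5], kmeans.labels_[-5:])  # each segment shares a label

# Dimensionality reduction: project 4-D data onto 2 principal components
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
print(X_2d.shape)                     # (100, 2)
print(pca.explained_variance_ratio_)  # variance captured by each component
```

Notice that the model receives only independent variables; the cluster labels it outputs are discovered, not provided.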

2. Supervised Learning: Both Independent and Dependent Variables

This method involves providing the model with both the data and its corresponding labels (dependent variables). This allows the model to learn the mapping between inputs and outputs, enabling it to:

  • Predict Categorical Outcomes: Solve “yes/no” or “this or that” problems, like predicting loan defaults, heart attacks, election wins, cricket match outcomes, or cancer diagnoses.
  • Predict Continuous Values: Estimate numerical outcomes, like stock prices, based on historical trends and market conditions.
  • Techniques Examples:
    • Classification:
      • Logistic regression: Models the probability of a binary outcome based on input features.
      • Decision trees: Split data into segments based on decision rules, assigning classes to each segment.
      • Support vector machines (SVM): Find optimal hyperplanes that separate data points into different classes.
      • K-nearest neighbors (KNN): Classifies data points based on the class of their K nearest neighbors.
      • Random forests: Ensemble method combining multiple decision trees for improved accuracy and robustness.
    • Regression:
      • Linear regression: Identifies linear relationships between variables to predict continuous outcomes.
      • Ridge regression and Lasso: Regularized linear regression models that penalize model complexity to prevent overfitting.
      • Polynomial regression: Captures non-linear relationships by transforming variables into polynomial terms.
      • Neural networks: Flexible models inspired by the human brain, capable of learning complex relationships from large datasets.
    • Other techniques:
      • Naive Bayes: Probabilistic classifier based on Bayes’ theorem, assuming independence between features.
      • Gradient boosting: Ensemble method that combines weak learners sequentially to improve accuracy.
      • Ensemble learning: Combining multiple models of different types for improved performance and generalizability.
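The defining feature of all these methods is that both the independent variables and the dependent variable are supplied during training. A minimal sketch of logistic regression with scikit-learn (the dataset and its labelling rule are invented for illustration):

```python
# Minimal sketch of supervised learning: logistic regression on a labelled
# synthetic dataset (scikit-learn assumed available; data is illustrative).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Independent variables (features) and a dependent variable (label):
# label = 1 when the sum of the two features exceeds 1, else 0.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# The model learns the mapping from inputs to the labelled outputs
clf = LogisticRegression().fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on held-out data
print(clf.predict([[2.0, 2.0]]))  # a clearly positive point -> class 1
```

Swapping `LogisticRegression` for `LinearRegression` (and a continuous `y`) gives the regression case; the supervised structure, features plus labels, is identical.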

3. Reinforcement Learning: Beyond Variables, Learning Through Rewards and Penalties

Reinforcement learning (RL) takes a different approach from supervised and unsupervised learning: instead of learning from pre-defined variables, an agent interacts with an environment, receiving rewards for desirable actions and penalties for undesirable ones, and learns through trial and error to maximize its cumulative reward. Much like a child learning through observation and feedback, the agent gradually adapts its behavior to achieve its goal. RL has applications such as:

  • Game playing: RL has achieved success in games like Go and StarCraft, demonstrating its ability to learn complex strategies.
  • Robotics and control systems: RL can be used to train robots to navigate environments, manipulate objects, and make optimal decisions.
  • Resource management and optimization: RL can optimize resource allocation in systems like energy grids and traffic networks.
  • Recommendation systems and personalization: RL can personalize recommendations for users based on their past interactions and preferences.
  • Basic Components:
    • Agent: The entity that interacts with the environment and makes decisions.
    • Environment: The world the agent operates in, providing feedback through rewards and penalties.
    • Action: The agent’s choices or strategies to influence the environment.
    • State: The representation of the environment the agent perceives at any given time.
    • Reward: Positive feedback indicating a desirable outcome for the agent.
    • Policy: The mapping between states and actions, defining the agent’s behavior.
  • Techniques Examples:
    • Common Techniques:
      • Q-learning: Estimates the optimal action-value function (Q-value) for each state-action pair, guiding the agent towards reward maximization.
      • Policy Gradient methods: Learn the optimal policy directly by updating its parameters based on the expected reward gradient.
      • Deep Q-learning: Combines Q-learning with deep neural networks to handle complex environments with high-dimensional state spaces.
      • Actor-Critic methods: Use two neural networks – one for policy evaluation (critic) and one for policy improvement (actor) – for faster learning and better convergence.
      • Exploration vs. Exploitation: Striking a balance between trying new actions to discover better rewards and exploiting known beneficial actions to maximize immediate reward.
    • Advanced Techniques:
      • Multi-agent Reinforcement Learning (MARL): Extends RL to scenarios with multiple agents interacting with each other, requiring coordination and collaboration.
      • Hierarchical RL: Decomposes tasks into smaller sub-tasks, allowing the agent to learn both high-level strategies and low-level actions.
      • Off-policy learning: Learn from past experiences even if they were not generated by the current policy, improving data efficiency.
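The components and techniques above can be made concrete with tabular Q-learning on a toy environment. The 5-state corridor below is a hypothetical example (not from any RL library): the agent (state, action, reward, and policy all appear explicitly) starts at state 0 and earns +1 for reaching state 4, with epsilon-greedy action selection illustrating the exploration-vs-exploitation trade-off:

```python
# Tabular Q-learning sketch on a toy 5-state corridor (hypothetical env).
import numpy as np

N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)                     # move left / move right
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration

Q = np.zeros((N_STATES, len(ACTIONS)))  # action-value table, one row per state
rng = np.random.default_rng(0)

for episode in range(200):
    state = 0
    while state != GOAL:
        # Exploration vs. exploitation: mostly greedy, sometimes random
        if rng.random() < epsilon:
            a = int(rng.integers(len(ACTIONS)))
        else:
            best = np.flatnonzero(Q[state] == Q[state].max())
            a = int(rng.choice(best))  # break ties among equal Q-values randomly
        next_state = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if next_state == GOAL else 0.0
        # Q-learning update: move Q(s, a) toward reward + discounted best
        # future value — this is the action-value estimate described above
        Q[state, a] += alpha * (reward + gamma * Q[next_state].max() - Q[state, a])
        state = next_state

# After training, the greedy policy moves right in every non-goal state
print([int(np.argmax(Q[s])) for s in range(GOAL)])  # action index 1 = right
```

The learned policy is simply "take the argmax action in each state" — the mapping from states to actions that the Policy component describes.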

An Illustrative Example: The Autonomous Car

Imagine an autonomous car whose objective is to navigate safely in various traffic scenarios and road conditions. In this case, the car would receive:

  • Penalties: For accidents, traffic violations, and failures to reach the destination.
  • Rewards: For safe driving, adhering to traffic rules, and successful navigation.

Through this continuous cycle of rewards and penalties, the car learns to optimize its behavior and achieve its objective without requiring explicitly labeled data.
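The car's feedback signal could be sketched as a simple reward function. The event names and numeric values below are purely illustrative assumptions, not taken from any real autonomous-driving framework:

```python
# Toy sketch of the car example's rewards and penalties as a reward table
# (all event names and values are illustrative, not from a real system).
REWARDS = {
    "safe_mile_driven":    +1.0,   # reward: safe driving
    "rule_followed":       +0.5,   # reward: adhering to traffic rules
    "destination_reached": +10.0,  # reward: successful navigation
    "accident":            -20.0,  # penalty
    "traffic_violation":   -5.0,   # penalty
    "destination_missed":  -10.0,  # penalty
}

def episode_return(events):
    """Sum the rewards and penalties the agent collected in one trip."""
    return sum(REWARDS[e] for e in events)

# One trip: drove a safe mile, broke one rule, still reached the destination
print(episode_return(["safe_mile_driven", "traffic_violation",
                      "destination_reached"]))  # 1.0 - 5.0 + 10.0 = 6.0
```

The agent never sees labelled examples of "good driving"; it only sees these scalar returns, and adjusts its behavior to make them as large as possible.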

Summary:

  • Unsupervised Learning: Unlabeled data, independent variables only, used for pattern discovery and data manipulation.
  • Supervised Learning: Labeled data, both independent and dependent variables, predicts categorical or continuous outcomes.
  • Reinforcement Learning: No pre-defined variables, learns through interaction with the environment, receives rewards and penalties to optimize behavior.
Comparison at a glance:

Unsupervised Learning
  • Dependent (target/response) variable: not present.
  • Independent (explanatory/predictor) variables/features: available.
  • Input: independent variables only.
  • Learning style: learns by patterns.
  • Applications: recommendation systems, feature reduction, market basket analysis, preference mapping, customer profiling & segmentation, outlier detection.
  • Example techniques: cluster analysis, factor analysis, conjoint analysis, association rules mining (Apriori algorithm), principal component analysis.

Supervised Learning
  • Dependent (target/response) variable: available.
  • Independent (explanatory/predictor) variables/features: available.
  • Input: both dependent and independent variables.
  • Learning style: learns by patterns.
  • Applications: credit risk modelling, fraud detection, discriminant analysis, object detection, video analysis, and other prediction tasks.
  • Example techniques: linear regression, logistic regression, support vector machines (SVM), decision trees, random forests, neural networks and their variants (CNN, RNN, etc.).

Reinforcement Learning
  • Dependent variable: not required.
  • Independent variables: not required.
  • Input: a starting point and an end point (goal).
  • Learning style: learns by mistakes (penalties and rewards), trying possible paths to maximize rewards and minimize penalties.
  • Applications: robotics, games, automation, autonomous cars.
  • Example techniques: Q-learning, deep Q-networks, proximal policy optimization (PPO), Boltzmann exploration, SARSA (state-action-reward-state-action).

By understanding these fundamental differences between supervised, unsupervised, and reinforcement learning, you can unlock the potential of each approach and choose the right tool for your specific tasks.