AI Explained Simply: From A to Z in 5 Minutes

Glossary of 50 AI terms with examples and data:


A

  1. AI (Artificial Intelligence)
    Definition: Machines designed to mimic human intelligence.
    Example: Self-driving cars.
    Data: GPS coordinates, sensor data, road images.

  2. Algorithm
    Definition: A step-by-step procedure used for calculations or problem-solving.
    Example: The QuickSort algorithm for sorting data (a sketch follows at the end of this section).
    Data: Input list: [3, 1, 4, 1, 5], Output: [1, 1, 3, 4, 5].

  3. Artificial Neural Network (ANN)
    Definition: A system of algorithms designed to recognize patterns.
    Example: Image recognition systems.
    Data: Input: pixels of a cat photo, Output: label "cat."

  4. Autonomous System
    Definition: A system that operates without human intervention.
    Example: A drone that can deliver packages.
    Data: GPS coordinates, battery status, obstacle distance.

  5. Agent
    Definition: An entity that can observe and act within an environment.
    Example: A robot vacuum that detects obstacles and cleans a room.
    Data: Sensors detecting obstacles, movement commands.
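
  A minimal Python sketch of the QuickSort idea from entry 2 above. The list values are just the example data, and this simple recursive version is for illustration rather than the in-place variant used in practice.

    def quicksort(items):
        # Recursively sort: pick a pivot, then sort the values below and above it.
        if len(items) <= 1:
            return items
        pivot = items[0]
        smaller = [x for x in items[1:] if x < pivot]
        larger = [x for x in items[1:] if x >= pivot]
        return quicksort(smaller) + [pivot] + quicksort(larger)

    print(quicksort([3, 1, 4, 1, 5]))   # [1, 1, 3, 4, 5]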


B

  1. Backpropagation
    Definition: A method to update weights in a neural network to minimize errors.
    Example: Training a neural network for digit recognition.
    Data: Input: Image of "3", Expected output: 3, Prediction: 5. The error is used to adjust the weights (a numeric sketch follows at the end of this section).

  2. Bias
    Definition: A systematic error in the model due to incorrect assumptions or data.
    Example: Gender bias in hiring algorithms.
    Data: A dataset with more resumes from men than women.

  3. BERT (Bidirectional Encoder Representations from Transformers)
    Definition: A transformer-based model for NLP tasks.
    Example: Sentiment analysis on tweets.
    Data: Input: "I love this movie", Output: Positive sentiment.


C

  1. Classification
    Definition: Categorizing data into predefined labels.
    Example: Email spam detection.
    Data: Emails labeled as "spam" or "not spam."

  2. Clustering
    Definition: Grouping similar data points together without labels.
    Example: Customer segmentation.
    Data: Customer purchases:

    • Customer A: $50, $70
    • Customer B: $300, $500
    • Customer C: $20, $40
      Clustering groups customers A and C as "budget shoppers."

  3. Computer Vision
    Definition: Teaching machines to interpret and process visual data.
    Example: Face recognition.
    Data: Image dataset labeled with names (Person A, Person B).

  4. Convolutional Neural Network (CNN)
    Definition: A deep neural network used mainly for image classification.
    Example: Recognizing objects in images.
    Data: Input: pixels of a dog photo, Output: "dog."

  5. Confusion Matrix
    Definition: A table used to evaluate classification performance.
    Example: In a spam filter, true positives are spam emails correctly flagged, false positives are legitimate emails flagged as spam, and so on.
    Data: True positives: 30, False positives: 5, False negatives: 2, True negatives: 50.
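
  Using the counts from entry 5, several common metrics can be read straight off the confusion matrix; this short Python sketch just does the arithmetic.

    tp, fp, fn, tn = 30, 5, 2, 50   # counts from the confusion matrix above

    accuracy = (tp + tn) / (tp + fp + fn + tn)   # 80 / 87 ≈ 0.92
    precision = tp / (tp + fp)                   # 30 / 35 ≈ 0.86
    recall = tp / (tp + fn)                      # 30 / 32 ≈ 0.94
    print(round(accuracy, 2), round(precision, 2), round(recall, 2))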


D

  1. Deep Learning
    Definition: A subset of machine learning with multiple layers of neural networks.
    Example: Image classification using deep networks.
    Data: Labeled images of animals: Input: Image, Output: Animal type.

  2. Decision Tree
    Definition: A model that splits data into branches based on feature values.
    Example: Predicting loan approval based on income and credit score.
    Data:

    • Income: $40k
    • Credit Score: 700
    • Prediction: Approved.

  3. Data Mining
    Definition: Discovering patterns in large datasets.
    Example: Identifying customer buying patterns.
    Data: Transaction data: Customer 1: $100, Customer 2: $200.

  4. Dataset
    Definition: A collection of data used to train or test machine learning models.
    Example: Iris flower dataset for classification.
    Data: Features: Petal length, petal width, Output: Flower species.

  5. Dimensionality Reduction
    Definition: Reducing the number of features in a dataset while retaining essential information.
    Example: Using PCA to reduce features in a high-dimensional dataset.
    Data: Data points with 5 features each, e.g. (1, 2, 3, 4, 5), reduced to 2 features per point.

  6. Dropout
    Definition: A regularization technique to prevent overfitting by randomly dropping neurons during training.
    Example: Used in deep neural networks for training.
    Data: During training, some neurons are ignored to avoid overfitting.
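
  A minimal sketch of dropout in a small PyTorch network (assuming PyTorch is installed); the layer sizes and the 0.5 drop rate are arbitrary choices for illustration.

    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(784, 128),
        nn.ReLU(),
        nn.Dropout(p=0.5),   # during training, each hidden unit is zeroed with probability 0.5
        nn.Linear(128, 10),
    )
    model.train()   # dropout is active in training mode
    model.eval()    # ...and disabled at inference time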


E

  1. Ensemble Learning
    Definition: Combining multiple models to improve performance.
    Example: Random Forest combining multiple decision trees.
    Data: 100 decision trees each make a prediction, and the majority vote is taken.

  2. Epoch
    Definition: A full pass through the entire training dataset during training.
    Example: Training a neural network on 1000 samples for 10 epochs.
    Data: A dataset of 1000 samples; each epoch is one complete pass over all 1000 samples.

  3. Exploratory Data Analysis (EDA)
    Definition: Analyzing datasets to summarize their main characteristics.
    Example: Visualizing a dataset of customer ages.
    Data: Customer ages: [25, 30, 35, 40, 45]. Visualize age distribution.
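
  A quick EDA sketch for the ages in entry 3, assuming pandas and matplotlib are available; describe() prints summary statistics and hist() draws the distribution.

    import pandas as pd
    import matplotlib.pyplot as plt

    ages = pd.Series([25, 30, 35, 40, 45], name="customer_age")
    print(ages.describe())   # count, mean, std, min, quartiles, max
    ages.hist(bins=5)        # simple age distribution plot
    plt.show()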


F

  1. Feature
    Definition: An individual variable used as input for machine learning models.
    Example: "Age" and "Income" can be features for predicting loan approval.
    Data: Age = 30, Income = 50000.

  2. Feature Engineering
    Definition: The process of transforming raw data into features that can be used for machine learning.
    Example: Creating an "age group" feature from raw age data.
    Data: Age = 25 → Age Group = "20-30."

  3. F1 Score
    Definition: A metric for classification performance, balancing precision and recall.
    Example: A model with high precision and recall will have a high F1 score.
    Data: Precision = 0.9, Recall = 0.8, F1 Score = 2 * (0.9 * 0.8) / (0.9 + 0.8) ≈ 0.85.
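
  The F1 calculation from entry 3 as a small Python helper; the precision and recall values are the ones given above.

    def f1_score(precision, recall):
        # Harmonic mean of precision and recall.
        return 2 * precision * recall / (precision + recall)

    print(round(f1_score(0.9, 0.8), 2))   # 0.85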


G

  1. Gradient Descent
    Definition: An optimization algorithm used to minimize loss in machine learning.
    Example: Used to train a neural network by updating weights.
    Data: A neural network with weights [0.2, 0.5]; the weights are nudged in the direction that reduces the loss (a sketch follows at the end of this section).

  2. Generative Adversarial Network (GAN)
    Definition: A deep learning framework consisting of two networks: a generator and a discriminator.
    Example: Generating realistic images from random noise.
    Data: Input: Random noise, Output: Fake image of a cat.

  3. Gradient Boosting
    Definition: A machine learning technique that builds an ensemble of weak learners in a sequential manner.
    Example: XGBoost algorithm for predictive modeling.
    Data: Training data with features like age, income for predicting credit approval.
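
  A minimal gradient descent sketch for entry 1: it minimizes a simple one-dimensional loss, (w - 3)**2, so the weight should move toward 3. The starting weight and learning rate are arbitrary.

    w = 0.2      # starting weight
    lr = 0.1     # learning rate

    for step in range(25):
        grad = 2 * (w - 3)   # derivative of the loss (w - 3)**2
        w -= lr * grad       # step downhill
    print(round(w, 3))       # close to 3.0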


H

  1. Hyperparameter
    Definition: Parameters that control the training process of a machine learning model.
    Example: Learning rate and batch size in a neural network.
    Data: Learning rate = 0.01, Batch size = 32.

  2. Hyperparameter Tuning
    Definition: The process of finding the best hyperparameters for a model.
    Example: Using grid search to find the best learning rate and number of trees for a decision tree.
    Data: Hyperparameters tested: learning rates of 0.1, 0.01, and 0.001.
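
  A sketch of hyperparameter tuning with scikit-learn's GridSearchCV, assuming scikit-learn is installed; the model, the synthetic stand-in data, and the tested learning rates mirror entry 2.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import GridSearchCV

    X, y = make_classification(n_samples=300, random_state=0)   # synthetic stand-in data

    search = GridSearchCV(
        GradientBoostingClassifier(random_state=0),
        param_grid={"learning_rate": [0.1, 0.01, 0.001]},
        cv=5,
    )
    search.fit(X, y)
    print(search.best_params_)   # the learning rate with the best cross-validated score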


I

  1. Inference
    Definition: The process of making predictions using a trained machine learning model.
    Example: Using a trained model to predict if an email is spam.
    Data: Input: Email text, Output: Spam or Not Spam.

  2. Imbalanced Dataset
    Definition: A dataset where certain classes or categories are underrepresented.
    Example: Predicting rare diseases where only 5% of patients have the disease.
    Data: Disease = "Cancer" 5%, Disease = "Healthy" 95%.


J

  1. Jaccard Similarity
    Definition: A metric to measure similarity between two sets.
    Example: Comparing two documents to see how similar they are.
    Data: Set A = {cat, dog}, Set B = {dog, bird}. Jaccard similarity = 1 / 3.
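
  The Jaccard calculation from the entry above, written out in Python with set operations; the two sets are the example data.

    def jaccard(a, b):
        # |intersection| / |union|
        return len(a & b) / len(a | b)

    print(jaccard({"cat", "dog"}, {"dog", "bird"}))   # 0.333...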

K

  1. K-Nearest Neighbors (KNN)
    Definition: A machine learning algorithm that classifies data based on the majority class of its nearest neighbors.
    Example: Classifying a point as "cat" or "dog" based on its nearest neighbors.
    Data: Neighbors of point X are mostly labeled as "cat."

  2. K-Means Clustering
    Definition: An algorithm for partitioning data into K clusters based on distance metrics.
    Example: Grouping customers into clusters based on purchasing behavior.
    Data: Customer data: [low spender, high spender, mid spender].
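
  A short K-Means sketch with scikit-learn (assuming it is installed); the spending numbers reuse the clustering example from section C, and K = 2 is an arbitrary choice.

    from sklearn.cluster import KMeans

    # Each row: [purchase 1, purchase 2] for one customer (A, B, C)
    spend = [[50, 70], [300, 500], [20, 40]]

    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(spend)
    print(kmeans.labels_)   # customers A and C should land in the same "budget" cluster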


L

  1. Logistic Regression
    Definition: A regression model used for binary classification tasks.
    Example: Predicting whether a customer will buy a product (Yes/No).
    Data: Features: age, income, and past purchases.

  2. LSTM (Long Short-Term Memory)
    Definition: A type of recurrent neural network designed to remember long-term dependencies.
    Example: Predicting the next word in a sentence.
    Data: Input: "I am going to", Output: "the store."


M

  1. Model Overfitting
    Definition: When a model learns too much from training data, capturing noise instead of general patterns.
    Example: A decision tree that perfectly classifies the training data but performs poorly on new data.
    Data: High accuracy on training data, low accuracy on test data.

  2. Machine Learning
    Definition: A subset of AI where systems learn from data to improve their performance.
    Example: Predicting house prices based on features like size and location.
    Data: Features: size, bedrooms, location; Target: price.

  3. Matrix Factorization
    Definition: Decomposing a large matrix into a product of smaller (lower-rank) matrices, often used in recommendation systems.
    Example: Netflix using matrix factorization to recommend movies.
    Data: User ratings for movies.
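
  A rough sketch of the matrix factorization idea with NumPy's SVD: a small made-up user-by-movie rating matrix is approximated by a rank-2 product. Real recommenders handle the missing ratings more carefully than this.

    import numpy as np

    # Made-up ratings: 4 users x 3 movies (0 here just means "no rating")
    R = np.array([[5, 3, 0],
                  [4, 0, 0],
                  [1, 1, 5],
                  [0, 1, 4]], dtype=float)

    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    k = 2
    R_approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # product of two small factors
    print(np.round(R_approx, 1))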


N

  1. Naive Bayes
    Definition: A probabilistic classifier based on Bayes’ theorem.
    Example: Classifying email as spam or not based on word frequency.
    Data: Words in email: ["win", "free", "money"] → Predicted class = Spam.

  2. Neural Network
    Definition: A computational model inspired by the human brain used for machine learning tasks.
    Example: A neural network used for handwriting recognition.
    Data: Image of handwritten "3", Output: Predicted label = "3."


O

  1. Overfitting
    Definition: A modeling error where the model learns the details and noise in the training data to the detriment of its performance on new data.
    Example: A model that memorizes training data but cannot generalize well to unseen data.
    Data: Training accuracy 95%, testing accuracy 60%.
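
  A small sketch of the train/test gap with scikit-learn (assuming it is installed): an unpruned decision tree fit on noisy synthetic data tends to score near 100% on the training set and noticeably lower on held-out data. Exact numbers vary with the random seed.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=200, n_features=20, flip_y=0.2, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)   # no depth limit: memorizes noise
    print(tree.score(X_train, y_train))   # 1.0 on the training data
    print(tree.score(X_test, y_test))     # noticeably lower on unseen data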

P

  1. Principal Component Analysis (PCA)
    Definition: A technique for reducing the number of features in a dataset while preserving as much variance as possible.
    Example: Reducing dimensions in a dataset with hundreds of features.
    Data: Input data with 100 features → reduced to 2 dimensions (a sketch follows at the end of this section).

  2. Precision
    Definition: A metric that evaluates the accuracy of positive predictions.
    Example: In spam classification, precision measures how many of the predicted spam emails are actually spam.
    Data: True positives: 20, False positives: 5, Precision = 20 / (20 + 5) = 0.8.
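
  A PCA sketch for entry 1 with scikit-learn (assuming it is installed); the 500 x 100 random matrix is a stand-in for a real dataset with 100 features.

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.random.rand(500, 100)            # stand-in: 500 samples, 100 features
    pca = PCA(n_components=2)
    X_2d = pca.fit_transform(X)

    print(X_2d.shape)                       # (500, 2)
    print(pca.explained_variance_ratio_)    # share of variance kept by each component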


Q

  1. Q-Learning
    Definition: A reinforcement learning algorithm that learns the value of actions in a given state.
    Example: A robot learning to navigate a maze.
    Data: State: Position in maze, Action: Move direction, Reward: Reached the goal.
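
  A minimal tabular Q-learning update for the maze example; the states, actions, learning rate, and discount factor are all made up for illustration.

    alpha, gamma = 0.5, 0.9                        # learning rate and discount factor
    states, actions = range(4), ("left", "right")  # tiny made-up maze
    Q = {(s, a): 0.0 for s in states for a in actions}

    def update(s, a, reward, s_next):
        # Q-learning update: move Q(s, a) toward reward + discounted best future value.
        best_next = max(Q[(s_next, a2)] for a2 in actions)
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])

    update(2, "right", 1.0, 3)   # moving right from state 2 reached the goal (reward 1)
    print(Q[(2, "right")])       # 0.5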

R

  1. Recurrent Neural Network (RNN)
    Definition: A neural network designed for sequential data processing.
    Example: Predicting the next word in a sentence.
    Data: Input: "I am", Output: "going".

  2. Random Forest
    Definition: An ensemble learning method using many decision trees.
    Example: Predicting customer churn using various decision trees.
    Data: Customer data with features like age, income, and churn status.

  3. Reinforcement Learning
    Definition: A machine learning technique where an agent learns by interacting with its environment and receiving feedback.
    Example: A robot learning to play chess through trial and error.
    Data: State: Chessboard configuration, Action: Move piece, Reward: Win or lose.


S

  1. Support Vector Machine (SVM)
    Definition: A supervised learning algorithm used for classification and regression tasks.
    Example: Classifying emails into "spam" and "not spam."
    Data: Features: word frequencies, Labels: "spam" or "not spam."
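
  A tiny spam-vs-not-spam SVM sketch with scikit-learn (assuming it is installed); the four example messages and their labels are made up.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.svm import SVC

    texts = ["win free money now", "meeting at 10am", "claim your free prize", "project update attached"]
    labels = ["spam", "not spam", "spam", "not spam"]   # made-up toy data

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(texts)                 # word-frequency features

    clf = SVC(kernel="linear").fit(X, labels)
    print(clf.predict(vectorizer.transform(["free money prize"])))   # likely ['spam']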

This is a comprehensive glossary of 50 common AI terms, with examples and data to provide better context for their applications.