AI Practitioner Glossary
48
Evaluation Metric
A standard to assess model performance.
Real-world use:
Used to compare regression models via root mean squared error (RMSE), classification models via F1-score, or ranking systems via mean average precision (MAP).
56
F-measure (F1 Score)
The harmonic mean of precision and recall.
Real-world use:
Used in binary classification evaluation when you need to balance precision and recall, especially in imbalanced datasets like fraud detection.
Common mistake:
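Reporting F1 at a single decision threshold as the only metric. F1 ignores true negatives and depends on the chosen threshold, so on highly imbalanced datasets it is worth checking it alongside threshold-independent metrics such as the area under the precision-recall curve or AUC-ROC.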
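As a minimal sketch of the definition above, the snippet below computes precision, recall, and F1 from confusion-matrix counts. The function name and the fraud-detection counts are made up for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 80 frauds caught, 20 false alarms, 40 frauds missed.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=40)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because the harmonic mean is dominated by the smaller of the two values, F1 stays low unless precision and recall are both reasonably high.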
61
GAN (Generative Adversarial Network)
A neural network framework with two models — generator and discriminator — competing to improve outputs.
Real-world use:
Used to generate realistic human faces from random noise, create synthetic training data, and develop deepfake technology for entertainment.
69
Hidden Layer
A layer in a neural network between input and output layers.
Real-world use:
Used to extract abstract features from input data, enabling deep networks to learn complex patterns in image recognition and natural language processing.
62
Generalization
A model's ability to perform well on unseen data.
Real-world use:
Used to evaluate real-world effectiveness of trained models, ensuring they work on new customers, images, or situations not seen during training.
Common mistake:
Confusing training performance with generalization ability, leading to overconfident models that fail in production.
66
Ground Truth
The actual labels used to compare model predictions against.
Real-world use:
Used to evaluate accuracy in image classification tasks, comparing model predictions against expert-verified diagnoses in medical imaging.
73
Imbalanced Dataset
A dataset where some classes are over- or under-represented.
Real-world use:
Used to describe fraud detection data with few fraud cases (1%) versus legitimate transactions (99%), requiring special handling techniques.
Common mistake:
Using accuracy as the primary metric for imbalanced datasets, which can be misleading due to class distribution.
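A small sketch of why accuracy misleads here: on a synthetic dataset with the 99% / 1% split described above, a "model" that never predicts fraud still scores 99% accuracy while catching no fraud at all. The data is fabricated purely for illustration.

```python
# Synthetic labels: 1 = fraud (1%), 0 = legitimate (99%).
labels = [1] * 10 + [0] * 990
# A degenerate classifier that always predicts "legitimate".
predictions = [0] * len(labels)

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / sum(labels)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
```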
67
Hashing Trick
A technique that converts categorical or text features into fixed-length vectors by applying a hash function to each feature and using the result as a vector index.
Real-world use:
Used in scalable text classification where vocabulary size is huge, mapping words to fixed-size feature vectors for efficient processing.
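The idea can be sketched in a few lines: each token is hashed, and the hash modulo the vector size picks the bucket to increment. This sketch assumes MD5 as the hash (chosen only because it is deterministic across runs) and a hypothetical bucket count of 8; production implementations typically use faster hashes and often a signed variant to reduce collision bias.

```python
import hashlib


def hash_features(tokens, n_buckets=8):
    """Map a list of tokens to a fixed-length count vector."""
    vector = [0] * n_buckets
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        index = int(digest, 16) % n_buckets  # bucket chosen by the hash
        vector[index] += 1
    return vector


vec = hash_features("the cat sat on the mat".split())
print(vec)
```

The vector length stays at `n_buckets` no matter how large the vocabulary grows; the cost is that distinct words can collide in the same bucket.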
68
Heuristic
A problem-solving approach using practical methods, not guaranteed to be optimal.
Real-world use:
Used in AI for game strategy estimation, route planning algorithms, and quick decision-making when optimal solutions are computationally expensive.
74
Imputation
Filling in missing data values.
Real-world use:
Used to handle nulls in medical record datasets by replacing missing values with statistical estimates like mean, median, or predicted values.
Common mistake:
Using simple mean imputation for all missing data without considering the underlying patterns or reasons for missingness.
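A minimal sketch of statistical imputation, assuming missing values are represented as `None`. The function name and the sample ages are hypothetical; the example shows how one extreme value pulls the mean away from the median.

```python
import statistics


def impute(values, strategy="median"):
    """Replace None entries with the median or mean of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "median":
        fill = statistics.median(observed)
    else:
        fill = statistics.fmean(observed)
    return [fill if v is None else v for v in values]


ages = [34, None, 29, 41, None, 90]
print(impute(ages, "median"))
print(impute(ages, "mean"))
```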
75
Inductive Learning
Learning from labeled data to generalize for new data.
Real-world use:
Used in standard supervised machine learning, training models on historical sales data to predict future sales patterns.
76
Information Gain
A metric to decide feature splits in decision trees.
Real-world use:
Used in building classification trees, selecting which feature to split on based on how much it reduces uncertainty in the target variable.
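One common way to quantify "reduction in uncertainty" is entropy: information gain is the parent node's entropy minus the size-weighted entropy of its children. The sketch below assumes that entropy-based definition; the labels and splits are invented for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the weighted average entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["spam"] * 4 + ["ham"] * 4
perfect_split = [["spam"] * 4, ["ham"] * 4]
useless_split = [["spam", "spam", "ham", "ham"], ["spam", "spam", "ham", "ham"]]

print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

A split that separates the classes completely recovers all of the parent's entropy, while a split that leaves the class mix unchanged gains nothing.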
77
Instance-Based Learning
Learning that memorizes training instances rather than generalizing.
Real-world use:
Used in k-nearest neighbors (KNN) for recommendation systems, finding similar users or products based on stored historical data.
78
Interquartile Range (IQR)
A measure of statistical dispersion between the 25th and 75th percentiles.
Real-world use:
Used in detecting outliers by flagging values below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR, where Q1 and Q3 are the 25th and 75th percentiles; common in financial data analysis.
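The 1.5 × IQR rule can be sketched as follows. Quartile conventions differ between tools; this sketch assumes the "inclusive" method of Python's `statistics.quantiles`, and the transaction amounts are made up.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


amounts = [12, 15, 14, 13, 16, 15, 14, 250]
print(iqr_outliers(amounts))  # -> [250]
```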
Core Algorithms
85
k-Means
A clustering algorithm that partitions data into k clusters.
Real-world use:
Used in market segmentation to group customers based on purchasing behavior, demographics, and preferences for targeted marketing campaigns.
Common mistake:
Choosing k arbitrarily without using methods like the elbow method to determine optimal cluster number.
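To make the partitioning step concrete, here is a deliberately tiny one-dimensional sketch of the usual assign-then-update loop (Lloyd's algorithm). The starting centroids and the customer spend values are hypothetical, and real implementations add smarter initialization and a convergence check.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Cluster 1-D points: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


spend = [10, 12, 11, 95, 100, 105]
centroids, clusters = kmeans_1d(spend, centroids=[0.0, 50.0])
print(centroids)  # -> [11.0, 100.0]
```

Running the same loop for several values of k and plotting the within-cluster distances is the basis of the elbow method mentioned above.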
86
k-Nearest Neighbors (KNN)
A classification (or regression) method that predicts from the k closest training examples, typically by majority vote or averaging.
Real-world use:
Used in handwriting recognition, recommendation systems, and image classification by finding the most similar training examples.
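A minimal sketch of KNN classification, assuming Euclidean distance and a simple majority vote. The training points and labels are invented.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training points. `train` is a list of (point, label) pairs."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
print(knn_predict(train, (1.1, 1.0)))  # -> a
print(knn_predict(train, (5.0, 5.2)))  # -> b
```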
94
Linear Regression
A method to model the relationship between variables with a straight line.
Real-world use:
Used to predict housing prices based on features like square footage, location, and number of bedrooms using a linear relationship.
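For a single feature, the best-fit line has a closed-form least-squares solution. The sketch below assumes ordinary least squares with one predictor; the square-footage and price figures are fabricated so that they lie exactly on a line.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


sqft = [1000, 1500, 2000, 2500]
price = [200_000, 275_000, 350_000, 425_000]
slope, intercept = fit_line(sqft, price)
print(slope, intercept)  # -> 150.0 50000.0
```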
95
Logistic Regression
A model for binary classification problems.
Real-world use:
Used to predict if a customer will churn, whether an email is spam, or if a patient has a disease based on input features.
97
LSTM (Long Short-Term Memory)
A type of RNN whose gated memory cells let it retain information across long sequences.
Real-world use:
Used in speech recognition and time series forecasting, handling sequences where long-term dependencies matter, like stock price prediction.
93
LeNet
One of the earliest convolutional neural networks.
Real-world use:
Used in digit recognition tasks, pioneering the application of CNNs for handwritten digit classification in postal services.
83
JSON (JavaScript Object Notation)
A lightweight data format used for data exchange.
Real-world use:
Used to structure data in REST APIs for ML apps, enabling seamless data transfer between web services and machine learning models.
84
k-Fold Cross-Validation
A technique that divides data into k parts and trains and tests the model k times, each time holding out a different part for testing.
Real-world use:
Used in robust model evaluation to ensure models generalize well, typically using 5-fold or 10-fold validation in machine learning competitions.
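The splitting logic can be sketched as an index generator: each fold serves once as the test set while the remaining indices form the training set. This sketch does no shuffling or stratification, both of which are common in practice.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 5))
for train, test in folds:
    print(test, train)
```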
87
Kernel Function
A function used in SVM to enable non-linear classification.
Real-world use:
Used in separating data that isn't linearly separable, like classifying images or text where complex decision boundaries are needed.
89
Label
The ground-truth outcome associated with a data point.
Real-world use:
Used in supervised learning for prediction targets, like "spam" or "not spam" for emails, or house prices for real estate data.
90
Labeled Data
Data that includes both input and known output values.
Real-world use:
Used to train supervised learning models, like image datasets with correct classifications or customer data with churn outcomes.
91
Latent Variable
A variable that is not directly observed but inferred from other variables.
Real-world use:
Used in topic modeling of documents, where hidden topics are inferred from word patterns, or in customer segmentation based on purchasing behavior.
92
Learning Rate
The step size used during optimization.
Real-world use:
Used to control how quickly a neural network trains, balancing fast convergence against stable learning (typical values range from 0.001 to 0.1).
Common mistake:
Setting learning rate too high (causing instability) or too low (causing extremely slow training).
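Both failure modes show up even on a toy problem. The sketch below runs plain gradient descent on the hypothetical loss f(w) = w², whose minimum is at w = 0, with three different learning rates.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Minimize f(w) = w**2 from `start` and return the final w."""
    w = start
    for _ in range(steps):
        gradient = 2 * w                 # derivative of w**2
        w -= learning_rate * gradient    # step against the gradient
    return w


for lr in (0.001, 0.1, 1.1):
    print(lr, gradient_descent(lr))
```

With a rate of 0.1 the weight lands close to 0 after 20 steps; with 0.001 it has barely moved from 1.0; with 1.1 every step overshoots the minimum and the magnitude of w grows instead of shrinking.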
96
Loss Function
A function that measures model error.
Real-world use:
Used to train models by minimizing error, like mean squared error for regression or cross-entropy loss for classification tasks.
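The two losses named above can be written out directly. This is a minimal sketch: the cross-entropy shown is the binary form, and the small `eps` clamp is an assumption added here to avoid taking log(0).

```python
from math import log


def mean_squared_error(y_true, y_pred):
    """Average squared difference, used for regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Average negative log-likelihood of 0/1 labels given predicted
    probabilities of the positive class."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep p strictly inside (0, 1)
        total += -(t * log(p) + (1 - t) * log(1 - p))
    return total / len(y_true)


print(mean_squared_error([1, 2, 3], [1, 2, 5]))
print(binary_cross_entropy([1, 0], [0.9, 0.1]))
```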
121
Parameter
A variable that the model learns during training.
Real-world use:
Used in adjusting weights in neural networks, determining how strongly each input feature influences the final prediction in image recognition or text classification.
122
PCA (Principal Component Analysis)
A dimensionality reduction technique that transforms features into components.
Real-world use:
Used to visualize high-dimensional data, reducing thousands of gene expression features to 2-3 components for cancer research visualization.
123
Perceptron
The simplest type of neural network unit.
Real-world use:
Used in early pattern recognition, serving as the foundation for modern neural networks and linear classification tasks.
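A sketch of the classic perceptron learning rule, assuming a step activation and two inputs. The task (learning logical AND) and the hyperparameter values are chosen only for illustration.

```python
def train_perceptron(samples, epochs=10, lr=1.0):
    """Train a two-input perceptron with the perceptron update rule."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1 if activation > 0 else 0
            error = target - prediction
            # Nudge the weights toward the correct answer when wrong.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
print([predict(weights, bias, x) for x, _ in and_gate])  # -> [0, 0, 0, 1]
```

AND is linearly separable, so the rule converges; a single perceptron cannot learn a function such as XOR, which is one reason hidden layers are needed.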
125
Pipeline
A sequence of data processing components.
Real-world use:
Used to streamline ML workflows, automatically processing data from cleaning through feature engineering to model training and prediction.
129
Preprocessing
Data preparation steps before training a model.
Real-world use:
Used to remove noise from sensor data, normalize features, handle missing values, and encode categorical variables before model training.
128
Predictive Modeling
Using data to build a model that can predict outcomes.
Real-world use:
Used in credit scoring and churn prediction, helping banks assess loan risk and companies identify customers likely to cancel subscriptions.
137
Sampling
Selecting a subset of data points from a larger set.
Real-world use:
Used in reducing dataset size for training when working with massive datasets, or creating representative samples for statistical analysis.
124
Performance Metric
A quantitative measure used to assess model performance.
Real-world use:
Used in comparing models (e.g., F1, accuracy), enabling data scientists to select the best performing model for production deployment.
126
Poisson Distribution
A discrete distribution that models the number of events occurring in a fixed interval of time or space, often applied to rare events.
Real-world use:
Used in predicting server failures or call arrivals, modeling events that occur independently at a constant average rate.
127
Precision
The ratio of true positives to predicted positives.
Real-world use:
Used when false positives are costly, like in spam filtering where marking legitimate emails as spam frustrates users.
Common mistake:
Optimizing for precision alone without considering recall, potentially missing important positive cases.
130
Probability
A measure of the likelihood of an event.
Real-world use:
Used in probabilistic forecasting, risk assessment, and uncertainty quantification in machine learning predictions.
131
Recall
The ratio of true positives to all actual positives.
Real-world use:
Used when missing a positive is costly (e.g., disease detection), ensuring most actual cases are identified even if some false positives occur.
Common mistake:
Confusing recall with precision or not understanding the precision-recall tradeoff in model optimization.
133
Regression
A predictive modeling technique for continuous outcomes.
Real-world use:
Used to forecast revenue, predict house prices, estimate sales figures, and other continuous numerical predictions.
135
Residual
The difference between an observed value and the model's prediction for it (conventionally actual minus predicted).
Real-world use:
Used in diagnosing regression models, analyzing residual patterns to identify violations of model assumptions or areas for improvement.
136
ROC Curve
A graph showing the true positive rate vs. false positive rate.
Real-world use:
Used to evaluate binary classifiers across different threshold settings, helping optimize the tradeoff between sensitivity and specificity.
139
Sensitivity
Another term for recall.
Real-world use:
Used in medical diagnosis testing to measure how well a test identifies patients who actually have the disease.
140
SGD (Stochastic Gradient Descent)
An optimization algorithm that updates weights for each data sample.
Real-world use:
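Used to train deep learning models efficiently, updating parameters more frequently than batch gradient descent; in practice the updates are usually computed on small mini-batches rather than single samples, trading noisier steps for faster progress on large datasets.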
141
Standard Deviation
A measure of data spread around the mean.
Real-world use:
Used in risk modeling to quantify volatility in financial markets, portfolio management, and quality control processes.
143
Support Vector Machine (SVM)
A classifier that finds the optimal separating hyperplane.
Real-world use:
Used in text classification, image recognition, and bioinformatics where clear decision boundaries between classes are important.
Core Algorithms
132
Recurrent Neural Network (RNN)
A neural network that processes sequences by looping over data.
Real-world use:
Used in time series forecasting and language modeling, processing sequential data like stock prices, weather patterns, or natural language text.
Common mistake:
Using basic RNNs for very long sequences where vanishing gradient problems make LSTM or GRU more appropriate.
Learning Paradigms
134
Reinforcement Learning
A learning paradigm where agents learn by reward and punishment.
Real-world use:
Used in robotics and game AI, enabling systems to learn optimal strategies through trial and error, like AlphaGo or autonomous vehicle navigation.
138
Semi-Supervised Learning
Training a model on a small labeled set and a large unlabeled set.
Real-world use:
Used when labeling data is expensive, like medical image analysis where expert annotations are costly but raw images are abundant.
142
Supervised Learning
Training a model using labeled data.
Real-world use:
Used in image classification, spam detection, and medical diagnosis where models learn from examples with known correct answers.
145
Underfitting
When a model is too simple to capture patterns in the data.
Real-world use:
Seen when both training and test error are high, like using linear regression for clearly non-linear relationships in stock price prediction.
Common mistake:
Assuming more complexity always improves performance without considering the underlying data patterns.
NLP & Text Processing
144
Tokenization
Splitting text into smaller units (tokens) such as words, subwords, characters, or punctuation symbols.
Real-world use:
Used in NLP preprocessing to break down sentences into analyzable units for sentiment analysis, machine translation, and chatbot development.
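A minimal word-level sketch using a regular expression. The rule used here (lowercase the text, keep runs of word characters, emit each punctuation mark as its own token) is one simple convention assumed for illustration; production NLP systems often use subword tokenizers instead.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


print(tokenize("Hello, world!"))  # -> ['hello', ',', 'world', '!']
```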
Metrics & Measures
81
Jaccard Similarity
A statistic for comparing two sets, defined as the size of their intersection divided by the size of their union.
Real-world use:
Used in text analysis and clustering to measure similarity between documents based on shared words or in recommendation systems for user similarity.
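As a sketch, Jaccard similarity between two documents can be computed on their sets of words. Treating two empty sets as identical (similarity 1.0) is a convention assumed here.

```python
def jaccard(a, b):
    """|A intersect B| / |A union B| for two iterables of hashable items."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention for two empty sets
    return len(a & b) / len(a | b)


doc1 = "the cat sat".split()
doc2 = "the cat ran".split()
print(jaccard(doc1, doc2))  # -> 0.5
```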
82
Joint Probability
The probability of two events occurring together.
Real-world use:
Used in Naive Bayes classifiers to calculate the probability of multiple features occurring together in spam detection or medical diagnosis.
88
Kurtosis
A measure of the "tailedness" of the probability distribution.
Real-world use:
Used in analyzing stock return distributions to understand risk, where high kurtosis indicates more extreme price movements.
98
Manhattan Distance
A distance metric based on grid-like movement.
Real-world use:
Used in KNN when measuring city-block differences (the sum of absolute coordinate differences), like calculating taxi distances in urban route planning or comparing points in grid-structured data.
99
Marginal Probability
The probability of a single event occurring, irrespective of the outcomes of other variables.
Real-world use:
Used in Bayes' theorem calculations for spam filtering, medical diagnosis, and other probabilistic models to understand individual event likelihood.
100
Mean Absolute Error (MAE)
The average of absolute differences between predictions and actual values.
Real-world use:
Used to evaluate regression models in sales forecasting, providing an interpretable measure of average prediction error in original units.
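The definition translates directly into code. The sales figures below are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference, in the same units as the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual_sales = [100, 150, 200]
forecast = [110, 140, 230]
print(f"{mean_absolute_error(actual_sales, forecast):.2f}")  # -> 16.67
```

An MAE of about 16.67 reads as "forecasts are off by roughly 17 units on average", which is what makes the metric easy to interpret.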
Optimization & Training
63
Gradient Descent
An optimization algorithm to minimize loss by updating model weights.
Real-world use:
Used, usually in a stochastic or mini-batch variant, to train most neural networks, iteratively adjusting weights to minimize prediction errors in tasks from image recognition to language translation.
65
Grid Search
A method to find the best combination of hyperparameters.
Real-world use:
Used to optimize model settings for maximum accuracy by systematically testing combinations of learning rates, regularization values, and network architectures.
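The exhaustive search can be sketched with `itertools.product`. The scoring function below is a stand-in invented for this example; in a real workflow it would train a model with the given settings and return a validation score.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every combination in `param_grid` and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, reg):
    # Hypothetical score surface that peaks at learning_rate=0.01, reg=0.1.
    return -((learning_rate - 0.01) ** 2) - (reg - 0.1) ** 2


grid = {"learning_rate": [0.001, 0.01, 0.1], "reg": [0.0, 0.1, 1.0]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params)
```

The number of evaluations is the product of the list lengths (9 here), which is why grid search becomes expensive as more hyperparameters are added.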
70
Hyperparameter
A configuration value set before training a model.
Real-world use:
Used to tune learning rate, batch size, number of layers, and regularization strength to optimize model performance for specific tasks.
71
Hyperparameter Tuning
The process of choosing the best hyperparameters for a model.
Real-world use:
Used in improving model performance by systematically testing different configurations to find optimal settings for specific datasets and tasks.
80
Iteration
A single update step during model training.
Real-world use:
Used in each step of gradient descent, where the model processes a batch of data and updates weights based on the calculated error.
Specialized Applications
64
Graph Neural Network (GNN)
A neural network designed to operate on graph structures.
Real-world use:
Used in social network analysis and recommendation engines, analyzing relationships between users, products, or molecular structures.
72
Image Classification
The task of assigning a label to an image.
Real-world use:
Used in identifying diseases in medical imaging, quality control in manufacturing, and content moderation on social media platforms.
79
Intersection over Union (IoU)
A metric for object detection accuracy: the area of overlap between the predicted and ground-truth regions divided by the area of their union.
Real-world use:
Used in comparing predicted and ground-truth bounding boxes in autonomous vehicle systems to measure how accurately objects are detected.
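For axis-aligned bounding boxes the computation is short. This sketch assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates with `x1 < x2` and `y1 < y2`; other box formats need conversion first.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7
```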
Techniques & Methods
49
Exploratory Data Analysis (EDA)
The process of summarizing the main characteristics of data.
Real-world use:
Used to detect patterns and anomalies visually through histograms, scatter plots, and correlation matrices before building models.
50
Extrapolation
Predicting beyond the range of observed data.
Real-world use:
Used in forecasting future sales based on historical trends, predicting population growth, or estimating stock prices beyond training data range.
Common mistake:
Extrapolating too far beyond training data range, leading to unreliable predictions due to unseen patterns.
51
Feature
A measurable input property of the data.
Real-world use:
Used as an input in predictive models, like age, income, and location in house price prediction or pixel values in image classification.
52
Feature Engineering
Creating or modifying features to improve model performance.
Real-world use:
Used to extract date parts from timestamps (day, month, year) for sales forecasting or creating interaction features for better predictions.
53
Feature Importance
A score indicating how much a feature influences prediction.
Real-world use:
Used to interpret tree-based models, identifying which factors most influence loan approvals or medical diagnoses for stakeholder understanding.
54
Feature Selection
Choosing the most relevant features for a model.
Real-world use:
Used to reduce overfitting and improve model speed by selecting only the most predictive variables from hundreds of potential features.
57
Fine-Tuning
Adjusting a pretrained model for a new task.
Real-world use:
Used to adapt BERT for sentiment analysis or fine-tune image classification models for specific domains like medical imaging.
59
Frequency Encoding
A method of encoding categorical data using frequency counts.
Real-world use:
Used in converting nominal values for ML models, replacing city names with their occurrence frequency in the dataset for better model performance.
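A minimal sketch of frequency encoding, assuming relative frequency (count divided by dataset size) as the encoded value; raw counts are an equally common choice. The city list is made up.

```python
from collections import Counter


def frequency_encode(values):
    """Replace each category with its relative frequency in `values`."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


cities = ["Paris", "Lyon", "Paris", "Nice"]
print(frequency_encode(cities))  # -> [0.5, 0.25, 0.5, 0.25]
```

Note that categories with the same frequency ("Lyon" and "Nice" here) receive the same code and become indistinguishable to the model.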
60
Function Approximation
Estimating an unknown function that best fits input/output pairs.
Real-world use:
Used in regression modeling to approximate the relationship between house features and prices, or between advertising spend and sales revenue.