A comprehensive list of ML and AI acronyms and abbreviations. Feel free to ⭐ it!
Machine learning is growing rapidly, producing ever more cryptic acronyms and abbreviations that can be hard to follow, especially for beginners. This list began when I collected all the acronyms from my Ph.D. thesis. Surprised by their sheer number, I searched the web hoping to copy and paste them to save time. I found a few lists, but none covered everything I needed, so I decided to gather all this information in a single table to make life easier for fellow ML enthusiasts.
Sources:
- Contributors' knowledge
- A Comprehensive Survey on Machine Learning for Networking Evolution Applications and Research Opportunities
- Deep learning acronym cheatsheet
- Machine learning acronyms list
- Awesome deep learning music
- Hearai.pl/paperslang/
Feel free to:
- add any ML-related abbreviation,
- add the definition alone,
- open an issue.
Currently, only ~30% of the abbreviations have definitions, so feel free to add them! A definition should be a brief, concise one-liner rather than an explanation of the whole subject. The purpose is to quickly find the meaning of an abbreviation, and the definition helps you check whether it matches the context. Abbreviations should be kept in alphabetical order.
I have added a link to the online doc with all abbreviations to make it easier for you to contribute. Feel free to add a new one and sort the table automatically. You can copy the table from Google Sheets to the markdown table generator: https://www.tablesgenerator.com/markdown_tables.
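Since the table must stay alphabetized, re-sorting it after a contribution can be automated. A minimal sketch in Python (the function name `sort_table_rows` is hypothetical, not part of this repo's tooling) that sorts the data rows of a pipe-delimited markdown table by the acronym in the first column:

```python
def sort_table_rows(table: str) -> str:
    """Sort the data rows of a pipe-delimited Markdown table
    alphabetically (case-insensitively) by the first column,
    keeping the header and separator rows in place."""
    lines = table.strip().splitlines()
    header, separator, rows = lines[0], lines[1], lines[2:]

    # First cell of each row: "| ACC | ... |" -> "acc"
    def key(row: str) -> str:
        return row.split("|")[1].strip().lower()

    return "\n".join([header, separator] + sorted(rows, key=key))
```

Paste the table in, and the rows come back in alphabetical order with the header untouched; copying the table through Google Sheets and the markdown table generator works just as well.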
| Acronym | Description | Definition |
|---|---|---|
| ACC | ACCuracy | Accuracy is a metric for evaluating classification models. |
| ACE | Alternating conditional expectation (ACE) algorithm | An algorithm to find the optimal transformations between the response variable and predictor variables in regression analysis. |
| ADA | AdaBoosted Decision Trees | Using AdaBoost to improve performance in decision trees. |
| AdaBoost | Adaptive Boosting | A statistical classification meta-algorithm that can be used in conjunction with many other types of learning algorithms to improve performance. |
| AdR | AdaBoostRegressor | Using AdaBoost to improve performance in regression. |
| ADT | Automatic Drum Transcription | Methods that aim to detect drum events in polyphonic music |
| AE | AutoEncoder | A type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning) |
| AGI | Artificial General Intelligence | The hypothetical ability of an intelligent agent to understand or learn any intellectual task that a human being can |
| AI | Artificial Intelligence | The simulation of human intelligence in machines that are programmed to think like humans and mimic their actions. |
| AIWPSO | Adaptive Inertia Weight Particle Swarm Optimization | An optimization algorithm using an individual search ability (ISA) to indicate whether each particle lacks global exploration or local exploitation abilities in each dimension. |
| AM | Activation Maximization | A method to visualize neural networks and aims to maximize the activation of certain neurons. |
| AMT | Automatic Music Transcription | Computational algorithms that convert acoustic music signals into some form of music notation |
| ANN | Artificial Neural Network | A collection of connected computational units or nodes called neurons arranged in multiple computational layers. |
| AR | Augmented Reality | An interactive experience of a real-world environment where the objects that reside in the real world are enhanced by computer-generated perceptual information, sometimes across multiple sensory modalities. |
| ARNN | Anticipation Recurrent Neural Network | A type of RNN designed to predict future inputs or states in sequential data. |
| AUC | Area Under the (ROC) Curve | The probability that the model ranks a randomly chosen positive instance higher than a randomly chosen negative one; summarizes the ROC curve in a single number. |
| BDT | Boosted Decision Tree | An ensemble learning method combining multiple decision trees, typically using boosting algorithms like AdaBoost or Gradient Boosting. |
| BERT | Bidirectional Encoder Representation from Transformers | Commonly used transformer-based language model. |
| BiFPN | Bidirectional Feature Pyramid Network | An efficient multi-scale feature fusion method used in object detection, allowing bidirectional (top-down and bottom-up) information flow. |
| BILSTM | Bidirectional Long Short-Term Memory | A bidirectional recurrent neural network architecture utilizing LSTM units (see LSTM). |
| BLEU | Bilingual Evaluation Understudy | A score measuring the quality of machine translation from one language into another. |
| BN | Bayesian Network | A probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). |
| BNN | Bayesian Neural Network | A type of artificial neural network built by introducing random variations into the network either by giving the network's artificial neurons stochastic transfer functions or by giving them stochastic weights |
| BP | BackPropagation | A widely used algorithm for training feedforward neural networks by propagating errors backward through the network. |
| BPMF | Bayesian Probabilistic Matrix Factorization | A probabilistic approach to matrix factorization, often used in recommender systems, incorporating Bayesian inference. |
| BPTT | Backpropagation Through Time | A gradient-based technique for training certain types of recurrent neural networks (e.g., LSTMs) by unrolling the network through time steps. |
| BQML | BigQuery Machine Learning | Google Cloud service enabling creation and execution of ML models in BigQuery using standard SQL queries. |
| BRNN | Bidirectional Recurrent Neural Network | An RNN variant that processes sequence data in both forward and backward directions, capturing context from past and future elements. |
| BRR | Bayesian Ridge Regression | A regression technique that incorporates Bayesian methods with Ridge Regression (L2 regularization). |
| CAE | Contractive AutoEncoder | An autoencoder variant that adds a penalty term to the loss function to encourage robustness of the learned representation to small input variations. |
| CALA | Continuous Action-set Learning Automata | A type of reinforcement learning agent operating in environments with continuous (non-discrete) action spaces. |
| CART | Classification And Regression Tree | An algorithm used to build decision trees for both classification and regression tasks by recursively partitioning the data space. |
| CAV | Concept Activation Vectors | Explainability method that provides an interpretation of a neural net's internal state in terms of human-friendly concepts. |
| CBI | Counterfactual Bias Insertion | A technique potentially used in fairness research to test model robustness against specific biases by inserting counterfactual examples. |
| CBOW | Continuous Bag of Words | A neural network model architecture (part of Word2Vec) used for learning word embeddings by predicting a target word from its surrounding context words. |
| CDBN | Convolutional Deep Belief Networks | A type of deep artificial neural network composed of multiple layers of convolutional restricted Boltzmann machines stacked together. |
| CE | Cross-Entropy | A common loss function used in classification tasks, measuring the difference between predicted probability distributions and the true distribution. |
| CEC | Constant Error Carousel | A key component within LSTM units that allows error signals to propagate back through time without vanishing or exploding gradient issues. |
| CF | Collaborative Filtering | Technique used in recommendation systems predicting user preferences based on patterns from similar users or items. |
| CLNN | ConditionaL Neural Networks | Neural networks whose output or internal processing is dependent on an auxiliary conditional input. |
| CMAC | Cerebellar Model Articulation Controller | A type of neural network inspired by the mammalian cerebellum, often used for function approximation and control tasks, using associative memory principles. |
| CMMs | Conditional Markov Model | A graphical model for sequence labeling that combines features of hidden Markov models (HMMs) and maximum entropy (MaxEnt) models. Also known as maximum-entropy Markov model (MEMM). |
| CNN | Convolutional Neural Network | A class of artificial neural network (ANN), typically using convolutional layers, most commonly applied to analyze visual imagery. |
| ConvNet | Convolutional Neural Network | A class of artificial neural network (ANN), typically using convolutional layers, most commonly applied to analyze visual imagery. (Synonym for CNN) |
| CRBM | Conditional Restricted Boltzmann Machine | An extension of the Restricted Boltzmann Machine where the visible and/or hidden units are conditioned on additional input variables. |
| CRFs | Conditional Random Fields | A class of statistical modeling methods often used for structured prediction tasks like sequence labeling (e.g., in NLP), modeling conditional probabilities. |
| CRNN | Convolutional Recurrent Neural Network | A hybrid neural network architecture combining Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), typically for spatio-temporal data. |
| CTC | Connectionist Temporal Classification | A loss function used for training sequence models (like RNNs) on tasks where the alignment between input and output sequences is variable or unknown (e.g., speech). |
| CTR | Collaborative Topic Regression | A recommendation model that integrates collaborative filtering with topic modeling (like LDA) to leverage item content information. |
| CV | Coefficient of Variation | The ratio of the standard deviation to the mean; used, for example, as an intra-cluster similarity measure when evaluating clustering models. |
| CV | Computer Vision | A field of AI enabling computers to "see" and interpret information from digital images or videos. |
| CV | Cross Validation | Resampling method for training, validation and testing a model across different iterations on portions of the full data set. |
| CSLR | Continuous Sign Language Recognition | Recognition and understanding of continuous sign language (whole phrases rather than isolated signs), capturing the meaning of signs essential for sign language translation (SLT). |
| DAAF | Data Augmentation and Auxiliary Feature | A technique possibly involving using auxiliary features alongside data augmentation to improve model training. |
| DAE | Denoising AutoEncoder or Deep AutoEncoder | An autoencoder trained to reconstruct clean input from corrupted versions (Denoising AE), often with multiple hidden layers (Deep AE). |
| DBM | Deep Boltzmann Machine | An undirected probabilistic graphical model (like RBM) with multiple layers of hidden variables, allowing for more complex representations. |
| DBN | Deep Belief Network | A generative graphical model composed of multiple layers of latent variables ("beliefs"), typically trained greedily layer-by-layer using RBMs. |
| DBSCAN | Density-Based Spatial Clustering of Applications with Noise | A density-based clustering algorithm that groups together points closely packed together, marking outliers as noise. |
| DCGAN | Deep Convolutional Generative Adversarial Network | A type of GAN that uses convolutional and convolutional-transpose layers in its discriminator and generator, respectively, primarily for image generation. |
| DCMDN | Deep Convolutional Mixture Density Network | Combines CNNs with Mixture Density Networks to model complex conditional probability distributions, often for image generation or regression tasks with uncertainty. |
| DE | Differential Evolution | A metaheuristic optimization algorithm belonging to the family of evolutionary algorithms, used for finding global optima, particularly in continuous spaces. |
| DeconvNet | DeConvolutional Neural Network | A neural network architecture often utilizing transposed convolutions (sometimes called deconvolutions) for tasks like image segmentation or visualization of CNN features. |
| DeepLIFT | Deep Learning Important FeaTures | An explainability method for deep learning models that attributes prediction differences to input feature differences based on a reference input. |
| DL | Deep Learning | A subfield of machine learning based on artificial neural networks with multiple layers (deep architectures) enabling learning of complex patterns. |
| DNN | Deep Neural Network | An artificial neural network (ANN) with multiple hidden layers between the input and output layers. |
| DQN | Deep Q-Network | A reinforcement learning algorithm that uses a deep neural network to approximate the Q-value (action-value) function. |
| DR | Detection Rate | The proportion of actual positives correctly identified (synonym for True Positive Rate or Recall). |
| DSN | Deep Stacking Network | A deep learning architecture based on stacking blocks of simple modules (like MLPs) trained sequentially, layer by layer. |
| DT | Decision Tree | A supervised learning model using a tree-like structure of decisions and their possible consequences to classify or regress data. |
| DTD | Deep Taylor Decomposition | An explainability technique that decomposes the prediction of a neural network based on Taylor series expansion, related to Layer-wise Relevance Propagation (LRP). |
| DWT | Discrete Wavelet Transform | A mathematical transform used for signal processing and feature extraction, decomposing signals into different frequency components at multiple scales. |
| ELECTRA | Efficiently Learning an Encoder that Classifies Token Replacements Accurately | A transformer-based pre-training method that learns by distinguishing real input tokens from plausible fake tokens generated by another small network (discriminator task). |
| ELM | Extreme Learning Machine | A feedforward neural network training algorithm where hidden node parameters are randomly assigned and only output weights are learned analytically, often very fast. |
| ELMo | Embeddings from Language Models | Contextual word embedding technique generating deep, character-based representations that vary based on the sentence context. |
| ELU | Exponential Linear Unit | An activation function similar to ReLU but with negative values, which can help push mean activations closer to zero, potentially speeding up learning. |
| EM | Expectation maximization | An iterative method for finding maximum likelihood or MAP estimates of parameters in statistical models with latent (unobserved) variables. |
| EMD | Entropy Minimization Discretization | A method for discretizing continuous features by finding split points that minimize the class information entropy within the resulting intervals. |
| ERNIE | Enhanced Representation through kNowledge IntEgration | A transformer-based language model (often associated with Baidu) that incorporates external knowledge (e.g., knowledge graph facts) during pre-training. |
| ETL Pipeline | Extract Transform Load Pipeline | A data integration process involving extracting data from sources, transforming it into a proper format, and loading it into a target system (like a data warehouse). |
| EXT | Extremely Randomized Trees | An ensemble learning method similar to Random Forests, but introduces more randomness in selecting node splits (both attribute and split point). |
| F1 Score | Harmonic Precision-Recall Mean | The harmonic mean of precision and recall, used as a performance metric for classification tasks, especially with imbalanced datasets. |
| FALA | Finite Action-set Learning Automata | A type of reinforcement learning agent operating in environments with a finite number of discrete actions. |
| FC | Fully-Connected | Layers where all the inputs from one layer are connected to every activation unit of the next layer. |
| FC-CNN | Fully Convolutional Convolutional Neural Network | A neural network architecture consisting entirely of convolutional layers (and pooling/upsampling), without any fully-connected layers. |
| FC-LSTM | Fully Connected Long Short-Term Memory | An LSTM network where connections between time steps or layers might involve fully connected transformations, combining sequential and dense processing. |
| FCM | Fuzzy C-Means | A clustering algorithm allowing data points to belong to multiple clusters with varying degrees of membership (fuzziness). |
| FCN | Fully Convolutional Network | A neural network that only performs convolution (and subsampling or upsampling) operations, often used for semantic segmentation. (Similar to FC-CNN) |
| FFT | Fast Fourier transform | An efficient algorithm to compute the Discrete Fourier Transform (DFT) and its inverse, widely used in signal processing and feature engineering. |
| FLOP | Floating Point Operations | A unit of measure of the amount of mathematical computations (like additions, multiplications) often used to describe the complexity of a neural network model. |
| FLOPS | Floating Point Operations Per Second | A unit of measure of computer performance, indicating how many floating-point operations a processor can perform per second. |
| FNN | Feedforward Neural Network | An artificial neural network where connections between nodes do not form a cycle; information moves only forward from input to output layers. |
| FNR | False Negative Rate | Proportion of actual positives predicted as negatives (1 - Recall/TPR). |
| FPN | Feature Pyramid Network | A neural network component, common in object detection, that builds multi-scale feature representations with rich semantics at all levels via lateral connections. |
| FPR | False Positive Rate | Proportion of actual negatives predicted as positives. |
| FST | Finite state transducer | A finite automaton with two tapes (input and output), used for modeling sequence-to-sequence transformations (e.g., in NLP/speech). |
| FWIoU | Frequency Weighted Intersection over Union | Metric in segmentation/object detection tasks. Weighted average of per-class IoUs, with weights proportional to class frequency. |
| GA | Genetic Algorithm | A metaheuristic optimization algorithm inspired by natural selection, using concepts like mutation, crossover, and selection to evolve solutions. |
| GALE | Global Aggregations of Local Explanations | An explainability technique that aims to derive global insights about a model's behavior by aggregating multiple local explanations (e.g., SHAP, LIME) from individual predictions. |
| GAM | Generalized Additive Model | A regression model where the output variable depends linearly on unknown smooth functions of predictor variables, allowing for non-linear relationships. |
| GAM | Global Attribution Mapping | An explainability method, often used with CNNs, to identify which input regions (e.g., pixels in an image) contribute most significantly to a specific output class. |
| GAMLSS | Generalized Additive Models for Location, Scale and Shape | An extension of GAMs allowing not just the mean (location) but also other distribution parameters (like scale/variance and shape/skewness) to be modeled with additive predictors. |
| GAN | Generative Adversarial Network | A deep-learning-based generative model trained "indirectly" through a discriminator: a second neural network that judges how "realistic" an input looks and is itself updated dynamically during training. |
| GAP | Global Average Pooling | A pooling operation often used in CNNs before the final classification layer, reducing each feature map to a single value by averaging, which helps reduce overfitting and enforces correspondence between feature maps and categories. |
| GBRCN | Gradient-Boosting Random Convolutional Network | A model likely combining gradient boosting techniques with randomly initialized convolutional features, possibly for time-series or image analysis. |
| GD | Gradient Descent | An optimization algorithm used to minimize some function by iteratively moving in the direction of steepest descent as defined by the negative of the gradient. |
| GEBI | Global Explanation for Bias Identification | Explainability method that aggregates local explanations (of single prediction) into a global explanation with the goal of finding biases and systematic errors in decision making. |
| GFNN | Gradient Frequency Neural Networks | Neural networks possibly designed to better learn or represent high-frequency components in data, potentially by manipulating gradients during training. |
| GLCM | Gray Level Co-occurrence Matrix | A statistical method for examining texture that considers the spatial relationship of pixels, used for feature extraction in image analysis. |
| Gloss2Text | A task of transforming raw glosses into meaningful sentences. | In sign language processing, the task of converting a sequence of sign glosses (word-level representations) into a grammatically correct spoken language sentence. |
| GloVe | Global Vectors | An unsupervised learning algorithm for obtaining vector representations for words, trained on aggregated global word-word co-occurrence statistics from a corpus. |
| GMM | Gaussian mixture model | A probabilistic model that assumes all the data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters. |
| GPR | Gaussian Process Regression | A non-parametric, Bayesian approach to regression where the model learns a distribution over functions, providing uncertainty estimates along with predictions. |
| GPT | Generative Pre-trained Transformer | An autoregressive language model that uses deep learning to produce human-like text. |
| GradCAM | GRADient-weighted Class Activation Mapping | A visualization technique for CNNs that uses the gradients flowing into the final convolutional layer to produce a coarse localization map highlighting important regions in the input image for predicting the concept. |
| HamNoSys | Hamburg Sign Language Notation System | An annotation system that describes sign language symbols. |
| HAN | Hierarchical Attention Network | A neural network architecture, typically used for document classification, employing attention mechanisms at both word and sentence levels to capture important information hierarchically. |
| HCA | Hierarchical Clustering Analysis | A method of cluster analysis which seeks to build a hierarchy of clusters, either agglomerative (bottom-up) or divisive (top-down). |
| HDP | Hierarchical Dirichlet process | A non-parametric Bayesian approach for modeling grouped data, often used in topic modeling to allow for an infinite number of topics shared across groups. |
| HHDS | HipHop Dataset | Likely refers to a specific dataset focused on Hip Hop music, used for tasks like music information retrieval (MIR), genre classification, or beat tracking. |
| hLDA | Hierarchical Latent Dirichlet allocation | An extension of LDA that organizes topics into a hierarchy, allowing documents to be associated with paths of topics at different levels of granularity. |
| HMM | Hidden Markov Model | A statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states, commonly used for sequential data like speech or NLP. |
| HNN | Hopfield Neural Network | A form of recurrent artificial neural network popularized by John Hopfield, serving as content-addressable ("associative") memory systems with binary threshold nodes. |
| i.i.d | Independent and Identically Distributed | A fundamental assumption in many statistical and machine learning models, stating that random variables in a sequence have the same probability distribution and are mutually independent. |
| ID3 | Iterative Dichotomiser 3 | An early algorithm used to generate a decision tree from a dataset, using information gain to select the best attribute at each step. |
| IDR | Input dependence rate | A metric possibly measuring how much a model's output or internal state depends on its input features, potentially used in explainability or sensitivity analysis. |
| IIR | Input independence rate | A metric likely measuring the degree to which a model's output is independent of its input features, possibly related to robustness or fairness evaluation. |
| INFD | Explanation Infidelity | A metric used in XAI to measure how poorly an explanation (e.g., feature attributions) reflects the actual behavior of the model when inputs are perturbed. |
| IoU | Jaccard index (intersection over union) | Metric in segmentation/object detection tasks. Ratio of areas of intersection and union of two (segmentation) boxes, corresponding to e.g. prediction and label. |
| ISIC | International Skin Imaging Collaboration | An academia-industry partnership focused on creating digital skin imaging standards and datasets for melanoma research, often used in computer vision challenges. |
| k-NN | k-Nearest Neighbor | A non-parametric, instance-based learning algorithm where classification or regression is based on the majority vote or average of the 'k' nearest neighbors in the feature space. |
| KAN | Kolmogorov-Arnold Networks | Ref. https://arxiv.org/abs/2404.19756v1 - A novel neural network architecture inspired by the Kolmogorov-Arnold representation theorem, potentially offering better interpretability and scaling properties compared to MLPs by using learnable activation functions on edges instead of fixed ones on nodes. |
| KDE | Kernel Density Estimation | A non-parametric way to estimate the probability density function of a random variable by placing kernels (usually Gaussian) over each data point. |
| KL | Kullback Leibler (KL) divergence | A measure of how one probability distribution diverges from a second, expected probability distribution; often used as a loss or regularization term (e.g., in VAEs). |
| kNN | k-Nearest Neighbours | A non-parametric supervised learning method used for classification and regression. (Synonym for k-NN) |
| KRR | Kernel Ridge Regression | A combination of Ridge Regression (L2-regularized linear regression) with the kernel trick, allowing it to learn non-linear functions in high-dimensional spaces. |
| LDA | Latent Dirichlet Allocation | A generative statistical model that allows sets of observations to be explained by unobserved groups that explain why some parts of the data are similar. |
| LDA | Linear Discriminant Analysis | A dimensionality reduction technique also used for classification, which aims to find a linear combination of features that characterizes or separates two or more classes. |
| LDADE | Latent Dirichlet Allocation Differential Evolution | Likely a hybrid approach combining LDA for topic modeling with Differential Evolution, possibly for optimizing LDA parameters or using topics within the DE process. |
| LightGBM | Light Gradient-Boosting Machine | Gradient boosting framework that uses tree based learning algorithms, originally developed by Microsoft. Known for efficiency and speed. |
| LIME | Local Interpretable Model-agnostic Explanations | An XAI technique that explains individual predictions of any black-box classifier by learning a simpler, interpretable model locally around the prediction. |
| LLM | Large Language Model | A deep learning model trained on vast amounts of text data, capable of understanding and generating human-like text for various NLP tasks. |
| LRP | Layer-wise Relevance Propagation | An XAI technique for deep neural networks that decomposes the output prediction backward through the layers to assign relevance scores to input features. |
| LSA | Latent semantic analysis | A technique in NLP using singular value decomposition (SVD) to analyze relationships between documents and terms, identifying latent semantic structures. |
| LSI | Latent Semantic Indexing | An indexing and retrieval method using LSA (SVD) to identify patterns in term-document relationships, improving information retrieval by handling synonymy and polysemy. (Often used interchangeably with LSA). |
| LSTM | Long Short-Term Memory | A recurrent neural network architecture that can process not only single data points (such as images) but also entire sequences of data (such as speech or video). |
| LTR | Learning To Rank | Application of machine learning to construct ranking models for information retrieval systems, ordering items based on relevance. |
| LVQ | Learning Vector Quantization | A prototype-based supervised classification algorithm, related to Self-Organizing Maps (SOM), that uses competitive learning to move prototypes towards or away from training instances based on class labels. |
| MADE | Masked Autoencoder for Distribution Estimation | An autoregressive model based on autoencoders, using carefully constructed masks to ensure that reconstructions respect autoregressive constraints, allowing for tractable density estimation. |
| MAE | Mean Absolute Error | Average of the absolute error between the actual and predicted values. |
| MAF | Masked Autoregressive Flows | A type of normalizing flow model for density estimation that uses masked autoregressive transformations (like MADE) to ensure invertibility and efficient computation. |
| MAP | Maximum A Posteriori (MAP) Estimation | A method for estimating unknown parameters in Bayesian statistics, finding the mode (peak) of the posterior distribution, incorporating prior knowledge. |
| MAPE | Mean Absolute Percentage Error | Average of the absolute errors between actual and predicted values, expressed as a percentage of the actual values. |
| MARS | Multivariate Adaptive Regression Spline | Non-parametric regression technique that extends linear models. Note that the name is trademarked; open-source implementations are often called "Earth". |
| MART | Multiple Additive Regression Tree | Another name for Gradient Boosted Decision Trees (GBDT), particularly associated with Friedman's original work, emphasizing the additive nature of the tree ensemble. |
| MaxEnt | Maximum Entropy | A modeling principle that selects, among all probability distributions consistent with the observed constraints, the one with maximum entropy; the basis of MaxEnt classifiers in NLP. |
| MCLNN | Masked ConditionaL Neural Networks | Conditional neural networks where masking techniques might be applied, possibly to control information flow or enforce specific dependencies based on the condition. |
| MCMC | Markov Chain Monte Carlo | A class of algorithms for sampling from a probability distribution by constructing a Markov chain that has the desired distribution |
| MCS | Model contrast score | |
| MDL | Minimum description length (MDL) principle | |
| MDN | Mixture Density Network | |
| MDP | Markov Decision Process | |
| MDRNN | Multidimensional recurrent neural network | |
| MER | Music Emotion Recognition | |
| MINT | Mutual Information based Transductive Feature Selection | |
| MIoU | Mean Intersection over Union | Metric in segmentation/object detection tasks. Mean of per-class IoUs. |
| ML | Machine Learning | The study of computer algorithms that can improve automatically through experience and by the use of data. |
| MLE | Maximum Likelihood Estimation | |
| MLM | Music Language Models | |
| MLP | Multi-Layer Perceptron | A fully connected class of feedforward artificial neural network |
| MPA | Mean Pixel Accuracy | Metric in segmentation/object detection tasks. Average ratio of correctly classified pixels by class. |
| MRR | Mean Reciprocal Rank | |
| MRS | Music Recommender System | |
| MSDAE | Modified Sparse Denoising Autoencoder | |
| MSE | Mean Squared Error | Average of the squares of the error between the actual and predicted values |
| MSR | Music Style Recognition | |
| NAS | Neural Architecture Search | A technique for automating the design of artificial neural networks. |
| NB | Naïve Bayes | |
| NBKE | Naïve Bayes with Kernel Estimation | |
| NER | Named Entity Recognition | |
| NERQ | Named Entity Recognition in Query | |
| NF | Normalizing Flow | |
| NFL | No Free Lunch (NFL) theorem | |
| NLP | Natural Language Processing | |
| NMS | Non Maximum Suppression | A technique used in object detection to remove redundant overlapping bounding boxes. |
| NMT | Neural Machine Translation | An approach to machine translation that uses a neural network to predict a sequence of words. |
| NN | Neural Network | |
| NNMODFF | Neural Network based Multi-Onset Detection Function Fusion | |
| NPE | Neural Physical Engine | |
| NRMSE | Normalized RMSE | RMSE normalized, e.g. by the range or mean of the observed values, making the error comparable across different scales. |
| NST | Neural Style Transfer | A method that uses deep neural networks to transfer the style of one image onto another. |
| NTM | Neural Turing Machine | |
| ODF | Onset Detection Function | |
| OLR | Ordinary Linear Regression | |
| OLS | Ordinary Least Squares | |
| PA | Pixel Accuracy | Metric in segmentation/object detection tasks. Ratio of correctly classified over total number of pixels. |
| PACO | Poisson Additive Co-Clustering | |
| PCA | Principal Component Analysis | The process of computing the principal components and using them to perform a change of basis on the data, sometimes using only the first few principal components and ignoring the rest. |
| PEGASUS | Pre-training with Extracted Gap-Sentences for Abstractive Summarization | |
| PLSI | Probabilistic Latent Semantic Indexing | |
| PM | Project Manager | |
| PMF | Probabilistic Matrix Factorization | |
| PMI | Pointwise Mutual Information | |
| PNN | Probabilistic Neural Network | |
| POC | Proof of Concept | |
| POMDP | Partially Observable Markov Decision Process | |
| POS | Part-of-Speech Tagging | |
| PPMI | Positive Pointwise Mutual Information | |
| PReLU | Parametric Rectified Linear Unit | |
| PU | Positive Unlabeled | Machine learning paradigm for learning from only positive and unlabeled data. |
| PYTM | Pitman-Yor Topic Modeling | |
| RandNN | Random Neural Network | |
| RANSAC | RANdom SAmple Consensus | |
| RBF | Radial Basis Function | |
| RBFNN | Radial Basis Function Neural Network | |
| RBM | Restricted Boltzmann Machine | |
| ReLU | Rectified Linear Unit | An activation function, max(0, x), that allows fast and effective training of deep neural architectures on large and complex datasets. |
| REPTree | Reduced Error Pruning Tree | |
| RF | Random Forest | |
| RGB | Red Green Blue color model | An additive color model used for display of images |
| RICNN | Rotation Invariant Convolutional Neural Network | |
| RIM | Recurrent Inference Machines | |
| RIPPER | Repeated Incremental Pruning to Produce Error Reduction | |
| RL | Reinforcement Learning | |
| RLFM | Regression based latent factors | |
| RLHF | Reinforcement learning from human feedback | |
| RMSE | Root MSE | Square root of MSE. |
| RNN | Recurrent Neural Network | |
| RNNLM | Recurrent Neural Network Language Model | |
| RoBERTa | Robustly Optimized BERT Pretraining Approach | Commonly used transformer-based language model. |
| ROC | Receiver Operating Characteristic | Curve that plots TPR versus FPR at various threshold settings. |
| ROI | Region Of Interest | |
| RR | Ridge Regression | |
| RTRL | Real-Time Recurrent Learning | |
| SAE | Stacked AE | |
| SARSA | State-Action-Reward-State-Action | |
| SBM | Stochastic block model | |
| SBO | Structured Bayesian optimization | |
| SBSE | Search-based software engineering | |
| SCH | Stochastic convex hull | |
| SDAE | Stacked DAE | |
| seq2seq | Sequence to Sequence Learning | Describes a training approach for converting sequences from one domain (e.g. sentences in English) into sequences in another domain (e.g. the same sentences translated to French). |
| SER | Sentence Error Rate | |
| SGBoost | Stochastic Gradient Boosting | |
| SGD | Stochastic Gradient Descent | |
| SGVB | Stochastic Gradient Variational Bayes | |
| SHAP | SHapley Additive exPlanation | |
| SHLLE | Supervised Hessian Locally Linear Embedding | |
| Sign2(Gloss+Text) | Sign to Gloss and Text | A two-step process that requires joint learning of sign language recognition and translation. |
| Sign2Gloss | Sign to Gloss | A one-to-one translation from a single sign to a single gloss. |
| Sign2Text | Sign to Text | A task of full translation from sign language into a spoken one; grammar and syntax are included. |
| SLP | Single-Layer Perceptron | |
| SLRT | Sign Language Recognition Transformer | An encoder transformer model trained to predict sign gloss sequences; it takes spatial embeddings and learns spatio-temporal representations. |
| SLT | Sign Language Translation | A full translation of signs to a spoken language. |
| SLTT | Sign Language Translation Transformer | An autoregressive transformer decoder model, trained on SLRT outputs, that predicts one word at a time to generate the corresponding spoken-language sentence. |
| SMBO | Sequential Model-Based Optimization | |
| SOM | Self-Organizing Map | A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher-dimensional data set while preserving the topological structure of the data. |
| SpRay | Spectral Relevance Analysis | Global explainability method using spectral clustering and local explanations (LRP). |
| SSD | Single-Shot Detector | A type of object detector that consists of a single stage. Examples include YOLO, RetinaNet, and EfficientDet. |
| SSL | Self-Supervised Learning | |
| SSVM | Smooth support vector machine | |
| ST | Style Transfer | An algorithm that transfers properties of one object onto another (e.g. transferring a painting's style onto a photograph). |
| STDA | Style Transfer Data Augmentation | A method that uses style transfer to augment a dataset. |
| STL | Self-Taught Learning | |
| SVD | Singing Voice Detection | |
| SVD | Singular Value Decomposition | |
| SVM | Support Vector Machine | Supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. |
| SVR | Support Vector Regression | Supervised learning models with associated learning algorithms that analyze data for regression analysis. |
| SVS | Singing Voice Separation | |
| t-SNE | t-distributed stochastic neighbor embedding | |
| T5 | Text-To-Text Transfer Transformer | Transformer based language model that uses a text-to-text approach. |
| TD | Temporal Difference | |
| TDA | Targeted Data Augmentation | |
| TGAN | Temporal Generative Adversarial Network | |
| THAID | THeta Automatic Interaction Detection | |
| TINT | Tree-Interpreter | |
| TLFN | Time-Lagged Feedforward Neural Network | |
| TNR | True Negative Rate | Proportion of actual negatives that are correctly predicted |
| TPR | True Positive Rate | Proportion of actual positives that are correctly predicted |
| TRPO | Trust Region Policy Optimization | |
| ULMFiT | Universal Language Model Fine-Tuning | |
| V-Net | Volumetric Convolutional Neural Network | 3D image segmentation based on a volumetric fully convolutional neural network. |
| VAD | Voice Activity Detection | |
| VAE | Variational AutoEncoder | An artificial neural network architecture belonging to the families of probabilistic graphical models and variational Bayesian methods. |
| VGG | Visual Geometry Group | Popular deep convolutional model designed for classification. |
| VPNN | Vector Product Neural Network | |
| VQ-VAE | Vector Quantized Variational Autoencoders | |
| VR | Virtual Reality | |
| WER | Word Error Rate | Metric measuring performance in NLP solutions, e.g. in automatic speech recognition (ASR). |
| WFST | Weighted Finite-State Transducer | |
| WMA | Weighted Majority Algorithm | |
| WPE | Weighted Prediction Error | |
| XAI | Explainable Artificial Intelligence | A set of processes and methods that make machine learning algorithms and their results more interpretable. |
| XGBoost | eXtreme Gradient Boosting | |
| YOLO | You Only Look Once | Fast object detection algorithm. |
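
As a quick illustration of how a few of the regression metrics above relate (MSE, RMSE, NRMSE), here is a minimal sketch in plain Python. The function names are my own, and the range-based normalization in `nrmse` is only one common convention among several (dividing by the mean or standard deviation is also used):

```python
import math

def mse(y_true, y_pred):
    # Mean Squared Error: average of squared differences.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Root Mean Squared Error: square root of MSE.
    return math.sqrt(mse(y_true, y_pred))

def nrmse(y_true, y_pred):
    # Normalized RMSE: here, RMSE divided by the range of observed values.
    return rmse(y_true, y_pred) / (max(y_true) - min(y_true))

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
print(mse(y_true, y_pred))   # 0.875
```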
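
The segmentation metrics in the table (PA, MIoU) likewise reduce to simple pixel counting. A hedged sketch over flat label lists, with hypothetical function names of my own (real implementations operate on image tensors and handle absent classes):

```python
def iou(pred, target, cls):
    # Intersection over Union for a single class over flat label lists.
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    return inter / union if union else float("nan")

def miou(pred, target, classes):
    # Mean IoU: average of per-class IoU scores.
    scores = [iou(pred, target, c) for c in classes]
    return sum(scores) / len(scores)

def pixel_accuracy(pred, target):
    # PA: ratio of correctly classified pixels to total pixels.
    return sum(1 for p, t in zip(pred, target) if p == t) / len(target)

pred   = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
print(miou(pred, target, [0, 1]))       # 0.5
print(pixel_accuracy(pred, target))
```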
