Oracle Advanced Analytics' Machine Learning Algorithms SQL Functions

Oracle Advanced Analytics provides a broad range of in-database, parallelized implementations of machine learning algorithms to solve many types of business problems.

See the Oracle Advanced Analytics documentation for more information on each algorithm, its settings, and its API calls. See DBMS_DATA_MINING in Database PL/SQL Packages and Types Reference.

The following sections list each technique, the problems it applies to, and the algorithms available.


Classification

Most commonly used technique for predicting a specific outcome such as response / no-response, high / medium / low-value customer, likely to buy / not buy.

Generalized Linear Models (Logistic Regression) —Classic statistical technique available inside the Oracle Database in a highly performant, scalable, parallelized implementation (as are all OAA ML algorithms). Supports text and transactional data (as do nearly all OAA ML algorithms).

Naive Bayes —Fast, simple, commonly applicable. Leverages the database's speed at counting.

Support Vector Machine —Newer-generation machine learning algorithm; supports text and wide data.

Decision Tree —Popular ML algorithm for interpretability. Provides human-readable "rules".
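The counting that Naive Bayes relies on (and that the database accelerates) can be sketched in a few lines of plain Python. This is an illustrative toy on made-up data, not Oracle's in-database implementation:

```python
from collections import Counter, defaultdict

def train_nb(rows, target):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(r[target] for r in rows)
    value_counts = defaultdict(Counter)  # (attr, class) -> value frequencies
    for r in rows:
        for attr, val in r.items():
            if attr != target:
                value_counts[(attr, r[target])][val] += 1
    return class_counts, value_counts

def predict_nb(model, case):
    """Pick the class maximizing prior * product of conditional frequencies."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best, best_p = None, -1.0
    for cls, n in class_counts.items():
        p = n / total
        for attr, val in case.items():
            # Add-one smoothing so unseen values do not zero out the product.
            p *= (value_counts[(attr, cls)][val] + 1) / (n + 2)
        if p > best_p:
            best, best_p = cls, p
    return best

rows = [
    {"income": "high", "owns_home": "yes", "buy": "yes"},
    {"income": "high", "owns_home": "no",  "buy": "yes"},
    {"income": "low",  "owns_home": "no",  "buy": "no"},
    {"income": "low",  "owns_home": "yes", "buy": "no"},
]
model = train_nb(rows, "buy")
print(predict_nb(model, {"income": "high", "owns_home": "yes"}))  # -> "yes"
```

Because training reduces to frequency counts over the data, it parallelizes naturally — which is exactly the property the in-database implementation exploits.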


Regression

Technique for predicting a continuous numerical outcome such as customer lifetime value, house value, or process yield rates.

Generalized Linear Models (Multiple Regression) —Classic statistical technique, now available inside the Oracle Database in a highly performant, scalable, parallelized implementation. Supports ridge regression, feature creation, and feature selection. Supports text and transactional data.

Support Vector Machine —Newer-generation machine learning algorithm; supports text and wide data.
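The regression idea — fitting a numeric outcome from predictors — can be shown with the one-predictor closed-form least-squares fit in plain Python (a toy illustration on made-up house data, not the in-database GLM):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form, one predictor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

# House value (in $1000s) vs. size (in 100s of sq ft) -- made-up data.
sizes  = [10, 15, 20, 25, 30]
values = [120, 170, 220, 270, 320]
a, b = fit_line(sizes, values)
print(a, b)  # the toy data lie exactly on value = 20 + 10 * size
```

The in-database implementation generalizes this to many predictors and adds ridge regularization and feature selection, but the objective — minimizing squared prediction error — is the same.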

Attribute Importance

Ranks attributes according to the strength of their relationship with the target attribute. Use cases include finding the factors most associated with customers who respond to an offer, or the factors most associated with healthy patients.

Minimum Description Length —Considers each attribute as a simple predictive model of the target class and provides its relative influence.
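The underlying idea — treat each attribute as a simple one-attribute predictor of the target and rank by how well it does — can be illustrated with a crude stand-in that scores attributes by single-attribute predictive accuracy (this is not the actual MDL computation, just the intuition):

```python
from collections import Counter, defaultdict

def rank_attributes(rows, target):
    """Score each attribute by how well its majority-class rule predicts the target."""
    scores = {}
    attrs = [a for a in rows[0] if a != target]
    for attr in attrs:
        by_value = defaultdict(Counter)
        for r in rows:
            by_value[r[attr]][r[target]] += 1
        # For each attribute value, predict its most common target class.
        correct = sum(c.most_common(1)[0][1] for c in by_value.values())
        scores[attr] = correct / len(rows)
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Made-up campaign-response data: "contacted" fully determines the target,
# "region" carries no signal.
rows = [
    {"responded": "yes", "region": "east", "contacted": "yes"},
    {"responded": "yes", "region": "west", "contacted": "yes"},
    {"responded": "no",  "region": "east", "contacted": "no"},
    {"responded": "no",  "region": "west", "contacted": "no"},
]
print(rank_attributes(rows, "responded"))
# -> [('contacted', 1.0), ('region', 0.5)]
```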

Anomaly Detection

Identifies unusual or suspicious cases based on deviation from the norm. Common examples include health care fraud, expense report fraud, and tax compliance.

One-Class Support Vector Machine —Trains on "normal" cases and then flags unusual cases.
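The "train on normal, flag deviations" pattern can be illustrated with a much simpler detector — a z-score check on made-up expense amounts. This is a deliberately crude stand-in for intuition only, not a one-class SVM:

```python
def fit_normal(values):
    """Estimate mean and standard deviation from 'normal' training cases."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var ** 0.5

def is_anomaly(value, mean, std, threshold=3.0):
    """Flag cases more than `threshold` standard deviations from the norm."""
    return abs(value - mean) > threshold * std

# Expense report amounts considered normal (made-up data).
normal = [100, 110, 95, 105, 90, 100, 108, 92]
mean, std = fit_normal(normal)
print(is_anomaly(104, mean, std), is_anomaly(900, mean, std))  # -> False True
```

A one-class SVM replaces the single mean-and-threshold boundary with a learned boundary around the normal cases, which also works for high-dimensional and non-numeric data.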


Clustering

Useful for exploring data and finding natural groupings. Members of a cluster are more like each other than they are like members of a different cluster. Common examples include finding new customer segments and life sciences discovery.

Enhanced K-Means —Distance-based; supports text mining and hierarchical clustering.

Orthogonal Partitioning Clustering —Density-based hierarchical clustering.

Expectation Maximization —Clustering technique that performs well on data mining problems with mixed (dense and sparse) data.
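The distance-based grouping that K-Means performs can be sketched with plain Lloyd's algorithm on one-dimensional points (a toy on made-up customer-spend values, not Oracle's Enhanced K-Means):

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm on 1-D points: assign, then re-average centers."""
    centers = points[:k]  # naive initialization: first k points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centers[c]))
            clusters[nearest].append(p)
        # Recompute each center as its cluster mean (keep old center if empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Two obvious groups of customer-spend values (made-up data).
spend = [10, 12, 11, 9, 95, 100, 105, 98]
print(kmeans(spend, 2))  # -> [10.5, 99.5]
```

The "Enhanced" in-database version adds better initialization, hierarchical cluster trees, and text support, but the assign-and-re-average loop above is the core of the algorithm.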


Association

Finds rules associated with frequently co-occurring items; used for market basket analysis, cross-sell, and root cause analysis. Useful for product bundling, in-store placement, and defect analysis.

Apriori —Industry standard for market basket analysis.
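The core of Apriori-style market basket analysis — counting item sets that co-occur in enough baskets — can be sketched with a toy frequent-pair counter (made-up baskets; the full Apriori algorithm extends this level-wise to larger item sets):

```python
from itertools import combinations
from collections import Counter

def frequent_pairs(baskets, min_support):
    """Count item pairs appearing together in at least `min_support` baskets."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "milk", "butter"],
]
print(frequent_pairs(baskets, min_support=3))  # -> {('bread', 'milk'): 3}
```

Frequent item sets like ("bread", "milk") are then turned into rules (e.g. bread ⇒ milk) with associated support and confidence.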

Feature Selection and Extraction

Produces new attributes as linear combinations of existing attributes. Applicable to text data, latent semantic analysis, data compression, data decomposition and projection, and pattern recognition.

Non-negative Matrix Factorization —Maps the original data into a new set of attributes.

Principal Components Analysis (PCA) —Creates a smaller set of new composite attributes that represent all the original attributes.

Singular Value Decomposition —Established feature extraction method with a wide range of applications.
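The "new composite attribute as a linear combination of existing attributes" idea behind PCA can be sketched in plain Python for two-dimensional data, using power iteration on the covariance matrix to find the first principal component (a toy on made-up points, not the in-database implementation):

```python
def top_principal_component(data, iters=100):
    """First principal component via power iteration on the 2x2 covariance matrix."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    centered = [(x - mx, y - my) for x, y in data]
    # Covariance matrix entries.
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    # Power iteration converges to the dominant eigenvector.
    v = (1.0, 0.0)
    for _ in range(iters):
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = (w[0] / norm, w[1] / norm)
    return v

# Made-up points lying near the line y = x: the top component should be
# close to (0.707, 0.707), i.e. the direction of greatest variance.
data = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.9), (5, 5.0)]
vx, vy = top_principal_component(data)
print(vx, vy)
```

Projecting each point onto this direction yields a single composite attribute that captures most of the variance — the compression and decomposition use cases listed above.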