javax.datamining.clustering
Interface Cluster


public interface Cluster

A Cluster object holds the metadata about a cluster discovered by running a clustering algorithm. Clusters are characterized such that any two points within a cluster are more similar (or closer in distance) than any two points drawn from two different clusters. Clusters are associated with a ClusteringModel.

Author:
JSR-73 Java Data Mining Expert Group

Method Summary
 Cluster[] getAncestors()
          Returns the ancestors of the cluster.
 long getCaseCount()
          Returns the number of cases in the portion of the training data assigned to the cluster during the model build, inclusive of children counts.
 java.lang.Double getCentroidCoordinate(java.lang.String numericalAttributeName)
          Returns the centroid point of the specified numerical attribute for the cluster.
 java.lang.Double getCentroidCoordinate(java.lang.String categoricalAttributeName, java.lang.Object category)
          Returns the centroid point of the specified categorical attribute for a specific category value for the cluster.
 Cluster[] getChildren()
          Returns an array of Cluster objects that are children of the cluster node.
 int getClusterId()
          Returns the cluster identifier.
 int getLevel()
          Returns the level in the clustering tree associated with the Cluster object.
 java.lang.String getName()
          Returns the name of the cluster designated by the clustering algorithm.
 Cluster getParent()
          Returns the parent of the cluster.
 Rule getRule()
          Returns the rule of the cluster.
 Predicate getSplitPredicate()
          Returns a Predicate object that stores information on how cases are assigned to the cluster node's children.
 AttributeStatisticsSet getStatistics()
          Returns the AttributeStatisticsSet object that describes the data assigned to the cluster.
 double getSupport()
          Returns the support defined as the percentage of cases assigned to this cluster relative to the total number of cases in the training data.
 boolean isLeaf()
          Returns true if the cluster is a leaf node in a hierarchical clustering model and false if it is an internal node.
 boolean isRoot()
          Returns true if the cluster is the root node in a hierarchical clustering model and false otherwise.
 

Method Detail

getAncestors

public Cluster[] getAncestors()
                       throws JDMException
Returns the ancestors of the cluster. The ancestors of a cluster are the clusters that connect the current cluster to the root cluster of a hierarchical clustering model. Returns an empty array for a non-hierarchical model. Ancestors are ordered by distance from the current node, parent first.

Returns:
Cluster[]
Throws:
JDMException
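
The ordering described above (parent first, root last) can be sketched by walking parent links. This is a minimal illustration, not the JDM implementation: ClusterView and Node are hypothetical stand-ins that mirror only getClusterId() and getParent(), and the checked JDMException is omitted.

```java
import java.util.ArrayList;
import java.util.List;

class AncestorPathSketch {
    /** Hypothetical stand-in mirroring two methods of Cluster (not the JDM type). */
    interface ClusterView {
        int getClusterId();
        ClusterView getParent(); // null when the node has no parent
    }

    /** Hypothetical in-memory node used only for this illustration. */
    static final class Node implements ClusterView {
        private final int id;
        private final ClusterView parent;

        Node(int id, ClusterView parent) {
            this.id = id;
            this.parent = parent;
        }

        @Override public int getClusterId() { return id; }
        @Override public ClusterView getParent() { return parent; }
    }

    /** Collects ancestors by following parent links: parent first, root last. */
    static List<ClusterView> ancestorsOf(ClusterView cluster) {
        List<ClusterView> ancestors = new ArrayList<>();
        for (ClusterView p = cluster.getParent(); p != null; p = p.getParent()) {
            ancestors.add(p);
        }
        return ancestors;
    }

    public static void main(String[] args) {
        Node root = new Node(1, null);
        Node inner = new Node(2, root);
        Node leaf = new Node(3, inner);
        for (ClusterView a : ancestorsOf(leaf)) {
            System.out.println(a.getClusterId()); // prints 2, then 1
        }
    }
}
```

In this sketch the number of collected ancestors equals the node's distance from the root, which matches the definition of a tree level given under getLevel.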

getCaseCount

public long getCaseCount()
Returns the number of cases in the portion of the training data assigned to the cluster during the model build, inclusive of children counts. This is a non-negative value.

Returns:
long

getCentroidCoordinate

public java.lang.Double getCentroidCoordinate(java.lang.String numericalAttributeName)
                                       throws JDMException
Returns the centroid point of the specified numerical attribute for the cluster. Returns null if the attribute value is missing for every case in the cluster, or if this model does not support centroids. Throws an exception if invoked on a categorical attribute.

Parameters:
numericalAttributeName - The attribute name whose centroid coordinate is to be returned. The attribute must be numerical.
Returns:
Double
Throws:
JDMException

getCentroidCoordinate

public java.lang.Double getCentroidCoordinate(java.lang.String categoricalAttributeName,
                                              java.lang.Object category)
                                       throws JDMException
Returns the centroid point of the specified categorical attribute for a specific category value for the cluster. The result is greater than or equal to 0. Returns null if the attribute value is missing for every case in the cluster, or if this model does not support centroids. Throws an exception if a numerical attribute is provided.

Parameters:
categoricalAttributeName - The attribute name whose centroid coordinate is to be returned. The attribute must be categorical.
category - The category value.
Returns:
Double
Throws:
JDMException
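
Because both overloads return a java.lang.Double that may be null, assigning the result directly to a primitive double would throw a NullPointerException when no centroid is available. One possible way to handle this is sketched below; the helper toOptional is hypothetical and not part of the JDM API.

```java
import java.util.OptionalDouble;

class CentroidNullSketch {
    /** Wraps a possibly-null coordinate so callers cannot unbox null by accident. */
    static OptionalDouble toOptional(Double coordinate) {
        return coordinate == null
                ? OptionalDouble.empty()
                : OptionalDouble.of(coordinate);
    }

    public static void main(String[] args) {
        Double missing = null; // stands in for a null result from getCentroidCoordinate
        Double present = 3.5;  // stands in for a non-null result
        System.out.println(toOptional(missing).isPresent());   // prints false
        System.out.println(toOptional(present).getAsDouble()); // prints 3.5
    }
}
```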

getChildren

public Cluster[] getChildren()
                      throws JDMException
Returns an array of Cluster objects that are children of the cluster node. The children of a cluster are its direct descendants in the clustering hierarchy. Returns null if the cluster node has no children, as is the case for leaf nodes.

Returns:
Cluster[]
Throws:
JDMException
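
Since a childless node yields null rather than an empty array, a recursive traversal needs a null check before iterating. The following is a minimal sketch of collecting leaf identifiers under that contract; ClusterView and Node are hypothetical stand-ins mirroring only getClusterId() and getChildren(), and the checked JDMException is omitted.

```java
import java.util.ArrayList;
import java.util.List;

class LeafCollectorSketch {
    /** Hypothetical stand-in mirroring two methods of Cluster (not the JDM type). */
    interface ClusterView {
        int getClusterId();
        ClusterView[] getChildren(); // null when the node has no children
    }

    /** Hypothetical in-memory node used only for this illustration. */
    static final class Node implements ClusterView {
        private final int id;
        private final ClusterView[] children;

        Node(int id, ClusterView... children) {
            this.id = id;
            // Mirror the documented contract: no children means null, not an empty array.
            this.children = children.length == 0 ? null : children;
        }

        @Override public int getClusterId() { return id; }
        @Override public ClusterView[] getChildren() { return children; }
    }

    /** Returns the identifiers of all leaves below (or at) the given node, depth first. */
    static List<Integer> leafIds(ClusterView cluster) {
        List<Integer> out = new ArrayList<>();
        collect(cluster, out);
        return out;
    }

    private static void collect(ClusterView cluster, List<Integer> out) {
        ClusterView[] children = cluster.getChildren();
        if (children == null) { // leaf node
            out.add(cluster.getClusterId());
            return;
        }
        for (ClusterView child : children) {
            collect(child, out);
        }
    }

    public static void main(String[] args) {
        Node left = new Node(2, new Node(4), new Node(5));
        Node root = new Node(1, left, new Node(3));
        System.out.println(leafIds(root)); // prints [4, 5, 3]
    }
}
```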

getClusterId

public int getClusterId()
Returns the cluster identifier.

Returns:
int

getLevel

public int getLevel()
Returns the level in the clustering tree associated with the Cluster object. A tree level represents the distance between a cluster node and the root node within a clustering hierarchy, that is, the number of splits along a tree branch required to produce the node. The root node is at level 0. Its children are at level 1. Their children are at level 2, etc.

Returns:
int

getName

public java.lang.String getName()
Returns the name of the cluster designated by the clustering algorithm.

Returns:
String

getParent

public Cluster getParent()
                  throws JDMException
Returns the parent of the cluster. The parent of a cluster is its direct ancestor in the hierarchical tree. Returns null if the cluster node has no parent, as is the case for the root of the tree.

Returns:
Cluster
Throws:
JDMException

getRule

public Rule getRule()
Returns the rule of the cluster.

Returns:
Rule

getSplitPredicate

public Predicate getSplitPredicate()
Returns a Predicate object that stores information on how cases are assigned to the cluster node's children. Since the clustering tree is binary, the predicate defines the conditions for one of the branches. Cases not satisfying these conditions fall into the other branch. The method is relevant only for clustering models created by a clustering algorithm that recursively partitions the space with axis-parallel splits. If a cluster node has no children, this method returns null.

Returns:
Predicate
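
The binary routing described above can be illustrated with a small sketch. It rests on two loudly stated assumptions: java.util.function.Predicate is used as a stand-in for the JDM Predicate type, and cases satisfying the predicate are sent to child index 0. The documentation above does not say which branch the predicate describes, so that choice is arbitrary here.

```java
import java.util.function.Predicate;

class SplitRoutingSketch {
    /**
     * Returns the index of the child that receives the case in a binary split.
     * Assumption: index 0 is the branch described by the predicate, index 1 the other.
     */
    static <T> int childIndexFor(Predicate<T> splitPredicate, T caseValue) {
        return splitPredicate.test(caseValue) ? 0 : 1;
    }

    public static void main(String[] args) {
        // Hypothetical axis-parallel split on a single numerical attribute.
        Predicate<Double> ageBelow30 = age -> age < 30.0;
        System.out.println(childIndexFor(ageBelow30, 25.0)); // prints 0
        System.out.println(childIndexFor(ageBelow30, 42.0)); // prints 1
    }
}
```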

getStatistics

public AttributeStatisticsSet getStatistics()
                                     throws JDMException
Returns the AttributeStatisticsSet object that describes the data assigned to the cluster. This may be null if the statistics of the cluster are not available.

Returns:
AttributeStatisticsSet
Throws:
JDMException

getSupport

public double getSupport()
Returns the support defined as the percentage of cases assigned to this cluster relative to the total number of cases in the training data. The value ranges between 0 and 100.

Returns:
double
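
As a worked illustration of the definition above, the arithmetic can be sketched as follows. The helper supportPercent is hypothetical and not part of the JDM API, and it assumes the total number of training cases is known to the caller.

```java
class SupportSketch {
    /** Computes support as a percentage in [0, 100] from a case count and a total. */
    static double supportPercent(long clusterCaseCount, long totalCaseCount) {
        if (totalCaseCount <= 0) {
            throw new IllegalArgumentException("totalCaseCount must be positive");
        }
        if (clusterCaseCount < 0 || clusterCaseCount > totalCaseCount) {
            throw new IllegalArgumentException("clusterCaseCount must be in [0, totalCaseCount]");
        }
        return 100.0 * clusterCaseCount / totalCaseCount;
    }

    public static void main(String[] args) {
        // 25 of 200 training cases assigned to the cluster.
        System.out.println(supportPercent(25, 200)); // prints 12.5
    }
}
```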

isLeaf

public boolean isLeaf()
Returns true if the cluster is a leaf node in a hierarchical clustering model and false if it is an internal node. This method always returns true for the clusters in a non-hierarchical clustering model.

Returns:
boolean

isRoot

public boolean isRoot()
Returns true if the cluster is the root node in a hierarchical clustering model and false otherwise. This method always returns true for the clusters in a non-hierarchical clustering model.

Returns:
boolean