

java.lang.Object
  de.aitools.dm.clustering.algorithms.ASoftClusterer
    de.aitools.dm.clustering.algorithms.AClusterer
      de.aitools.dm.clustering.algorithms.KNNHAC
public final class KNNHAC
Class for clustering Vectors using the Hierarchical Agglomerative Clustering algorithm (EfficientHAC) described in Introduction to Information Retrieval.
In hierarchical agglomerative clustering, every data point starts out as its own cluster. In each iteration, two clusters are merged until only the desired number of clusters is left. The two clusters chosen for merging are those whose Proximity is highest. However, there are different methods to define the proximity between two clusters.
Implemented at the moment are:
Distance measure as Proximity. For this implementation, all measures have to be symmetric.
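
The merge loop described above can be illustrated with a minimal sketch. It is not the KNNHAC implementation: the class name NaiveHacSketch, the plain double[][] similarity matrix (used here instead of Vector and Proximity), and the choice of single link as cluster proximity are all assumptions made for illustration. It also recomputes the best pair from scratch in every iteration, which is far slower than an efficient HAC.

```java
import java.util.Arrays;

/** Illustrative sketch only; this is not the KNNHAC implementation. */
final class NaiveHacSketch {

    /**
     * Merges clusters until only numClusters are left.
     *
     * @param sim symmetric similarity matrix with finite values (higher = closer);
     *            an assumption of this sketch
     * @param numClusters desired number of clusters, between 1 and sim.length
     * @return one cluster label per data point
     */
    static int[] cluster(double[][] sim, int numClusters) {
        int n = sim.length;
        if (numClusters < 1 || numClusters > n) {
            throw new IllegalArgumentException("numClusters must be in [1, " + n + "]");
        }
        int[] label = new int[n];
        for (int i = 0; i < n; i++) {
            label[i] = i; // every data point starts as its own cluster
        }
        for (int remaining = n; remaining > numClusters; remaining--) {
            int bestA = -1;
            int bestB = -1;
            double best = Double.NEGATIVE_INFINITY;
            // Single link: the proximity of two clusters is the highest
            // similarity between any of their members, so the best pair of
            // points from different clusters identifies the best cluster pair.
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    if (label[i] != label[j] && sim[i][j] > best) {
                        best = sim[i][j];
                        bestA = label[i];
                        bestB = label[j];
                    }
                }
            }
            // Merge cluster bestB into cluster bestA.
            for (int i = 0; i < n; i++) {
                if (label[i] == bestB) {
                    label[i] = bestA;
                }
            }
        }
        return label;
    }

    public static void main(String[] args) {
        double[][] sim = {
            {1.0, 0.9, 0.1, 0.1},
            {0.9, 1.0, 0.1, 0.1},
            {0.1, 0.1, 1.0, 0.8},
            {0.1, 0.1, 0.8, 1.0}
        };
        System.out.println(Arrays.toString(cluster(sim, 2))); // prints [0, 0, 2, 2]
    }
}
```

The symmetry requirement shows up directly in the sketch: only the upper triangle sim[i][j] with i < j is read, which is valid only if sim[i][j] equals sim[j][i].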
@book{BackhausErichsonPlinkeWeiber2006,
  author    = {Backhaus, Klaus and Erichson, Bernd and Plinke, Wulff and Weiber, Rolf},
  publisher = {Springer},
  title     = {Multivariate Analysemethoden},
  year      = {2006}
}
@book{ManningRaghavanHinrich2008,
  address   = {New York, NY},
  author    = {Manning, Christopher D. and Raghavan, Prabhakar and Schütze, Hinrich},
  publisher = {Cambridge University Press},
  title     = {Introduction to Information Retrieval},
  year      = {2008}
}
@book{TanSteinbachKumar2006,
  address   = {Boston, MA},
  author    = {Tan, Pang-Ning and Steinbach, Michael and Kumar, Vipin},
  publisher = {Pearson Education},
  title     = {Introduction to Data Mining},
  year      = {2006}
}
@article{LanceWilliams1967,
  author    = {Lance, G. N. and Williams, W. T.},
  journal   = {Computer Journal},
  title     = {A general theory of classificatory sorting strategies. 1. Hierarchical Systems},
  year      = {1967}
}
Field Summary 

Fields inherited from interface de.aitools.dm.clustering.Clusterer 

DEFAULT_SEED 
Constructor Summary  

KNNHAC(Configuration configuration)
Create a new KNearestNeighborHierarchicalAgglomerativeClusterer (KNNHAC).

KNNHAC(HACClusterMethod clusterMethod, Proximity<Vector> proximityMeasure)
Create a new KNearestNeighborHierarchicalAgglomerativeClusterer (KNNHAC) using the default value for the number of neighbors (see setNumberOfNeighbors(int)).

KNNHAC(HACClusterMethod clusterMethod, Proximity<Vector> proximityMeasure, int numNeighbors)
Create a new KNearestNeighborHierarchicalAgglomerativeClusterer (KNNHAC).
Method Summary  

int[] 
cluster(Vector[] data)
This method is used for clustering via the TIRA Framework. 
int[] 
cluster(Vector[] data, double threshold)
Cluster given data hierarchically until the proximities between all clusters are less than or equal to threshold.
int[] 
cluster(Vector[] data, int numClusters)
Cluster given data hierarchically until only numClusters clusters are left.
Dendrogram<DoubleMerge> 
clusterDendrogram(Vector[] data)
Cluster given data hierarchically. 
HACClusterMethod 
getClusterMethod()

int 
getNumberOfNeighbors()

Proximity<Vector> 
getProximityMeasure()

static void 
main(java.lang.String[] args)

void 
setClusterMethod(HACClusterMethod clusterMethod)
This is a general implementation of a hierarchical agglomerative clustering algorithm (HAC). 
void 
setNumberOfClusters(int numClusters)
Made for integration into the TIRA Framework. 
void 
setNumberOfNeighbors(int numNeighbors)
This sets the number of neighbors for the KNearestNeighborGraph that is used in this algorithm. 
void 
setProximityMeasure(Proximity<Vector> proximityMeasure)
Sets the proximity measure to be used for the clustering steps. 
Methods inherited from class de.aitools.dm.clustering.algorithms.AClusterer 

cluster, cluster, cluster, clusterSoft 
Methods inherited from class de.aitools.dm.clustering.algorithms.ASoftClusterer 

clusterSoft, clusterSoft, clusterSoft, getBiggestRange 
Methods inherited from class java.lang.Object 

equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait 
Constructor Detail 

public KNNHAC(HACClusterMethod clusterMethod, Proximity<Vector> proximityMeasure)
Create a new KNearestNeighborHierarchicalAgglomerativeClusterer (KNNHAC) using the default value for the number of neighbors (see setNumberOfNeighbors(int)).
Parameters:
clusterMethod - As in setClusterMethod(HACClusterMethod).
proximityMeasure - As in setProximityMeasure(Proximity).
See Also:
KNNHAC(HACClusterMethod, Proximity, int), KNNHAC(Configuration)

public KNNHAC(HACClusterMethod clusterMethod, Proximity<Vector> proximityMeasure, int numNeighbors)
Create a new KNearestNeighborHierarchicalAgglomerativeClusterer (KNNHAC).
Parameters:
clusterMethod - As in setClusterMethod(HACClusterMethod).
proximityMeasure - As in setProximityMeasure(Proximity).
numNeighbors - As in setNumberOfNeighbors(int).
See Also:
KNNHAC(HACClusterMethod, Proximity), KNNHAC(Configuration)

public KNNHAC(Configuration configuration)
Create a new KNearestNeighborHierarchicalAgglomerativeClusterer (KNNHAC).
Parameters:
configuration - Object for configuring this clusterer:
    HACClusterMethod to use. See setClusterMethod(HACClusterMethod).
    Proximity<Vector> as in setProximityMeasure(Proximity).
    Number of neighbors as in setNumberOfNeighbors(int).
See Also:
KNNHAC(HACClusterMethod, Proximity), KNNHAC(HACClusterMethod, Proximity, int)
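
All three constructors fix, explicitly or by default, the number of neighbors of the k-nearest-neighbor graph the algorithm works on. As a rough idea of what an undirected k-nearest-neighbor graph can look like, here is a small sketch. It is not KNNGraph.createUndirectedKNNIntGraph(Vector[], Proximity, int, double): the class name KnnGraphSketch, the double[][] similarity matrix, the boolean adjacency matrix, and the rule "connect i and j if either is among the other's k most similar points" are assumptions for illustration, and the library method's fourth (double) parameter is not modeled at all.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Illustrative sketch only; not the library's KNNGraph. */
final class KnnGraphSketch {

    /**
     * Builds an undirected k-nearest-neighbor graph.
     *
     * @param sim symmetric similarity matrix (higher = closer); an assumption of this sketch
     * @param k   number of neighbors to keep per data point
     * @return adjacency matrix; adj[i][j] is true if i and j are connected
     */
    static boolean[][] undirectedKnn(double[][] sim, int k) {
        int n = sim.length;
        boolean[][] adj = new boolean[n][n];
        if (n == 0) {
            return adj;
        }
        for (int i = 0; i < n; i++) {
            final int row = i;
            // Collect all other points and sort them by descending similarity to i.
            Integer[] others = new Integer[n - 1];
            int idx = 0;
            for (int j = 0; j < n; j++) {
                if (j != i) {
                    others[idx++] = j;
                }
            }
            Arrays.sort(others, Comparator.comparingDouble((Integer j) -> -sim[row][j]));
            // Keep the k most similar points; add each edge in both directions
            // so the resulting graph is undirected.
            int limit = Math.min(Math.max(k, 0), others.length);
            for (int r = 0; r < limit; r++) {
                int j = others[r];
                adj[i][j] = true;
                adj[j][i] = true;
            }
        }
        return adj;
    }

    public static void main(String[] args) {
        double[][] sim = {
            {1.0, 0.9, 0.1, 0.1},
            {0.9, 1.0, 0.6, 0.1},
            {0.1, 0.6, 1.0, 0.8},
            {0.1, 0.1, 0.8, 1.0}
        };
        System.out.println(Arrays.deepToString(undirectedKnn(sim, 1)));
    }
}
```

In a sketch like this, a small k yields a sparse graph with few candidate pairs, while a larger k keeps more cross-group edges such as the one between points 1 and 2 in the example matrix.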
Method Detail 

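public void setClusterMethod(HACClusterMethod clusterMethod)
This is a general implementation of a hierarchical agglomerative clustering algorithm (HAC).
Parameters:
clusterMethod - The cluster method to use.

public void setNumberOfNeighbors(int numNeighbors)
This sets the number of neighbors for the KNearestNeighborGraph that is used in this algorithm. See KNNGraph.createUndirectedKNNIntGraph(Vector[], Proximity, int, double) for an explanation.
Parameters:
numNeighbors - Number of neighbors for the graph as described above.

public void setProximityMeasure(Proximity<Vector> proximityMeasure)
Sets the proximity measure to be used for the clustering steps.
Parameters:
proximityMeasure - The measure to use.

public void setNumberOfClusters(int numClusters)
Made for integration into the TIRA Framework. Because the TIRA Framework clusters via cluster(Vector[]), through which the number of clusters cannot be specified, this method (or KNNHAC(Configuration)) can be used to tell the algorithm how many clusters to generate.
Parameters:
numClusters - Number of clusters to generate. Must be greater than zero.

public HACClusterMethod getClusterMethod()
See setClusterMethod(HACClusterMethod).
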
public Proximity<Vector> getProximityMeasure()
See setProximityMeasure(Proximity).

public int getNumberOfNeighbors()
See setNumberOfNeighbors(int) for more information.

public int[] cluster(Vector[] data, int numClusters)
Cluster given data hierarchically until only numClusters clusters are left.
Parameters:
data - The vectors to cluster.
numClusters - Number of clusters to generate.
See Also:
cluster(Vector[], double), clusterDendrogram(Vector[])

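public int[] cluster(Vector[] data, double threshold)
Cluster given data hierarchically until the proximities between all clusters are less than or equal to threshold.
Parameters:
data - The vectors to cluster.
threshold - The threshold for clustering.
See Also:
cluster(Vector[], int), clusterDendrogram(Vector[])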
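
The threshold stopping rule can be sketched as follows: keep merging the closest pair of clusters and stop as soon as no pair has a proximity above the threshold. This is an illustration under assumptions, not the KNNHAC code: ThresholdHacSketch and its Linkage enum are hypothetical names (Linkage is not the library's HACClusterMethod), and proximities are taken from a plain symmetric double[][] similarity matrix.

```java
import java.util.Arrays;

/** Illustrative sketch only; this is not the KNNHAC implementation. */
final class ThresholdHacSketch {

    /** Hypothetical stand-in for a cluster method; not the library's HACClusterMethod. */
    enum Linkage { SINGLE, COMPLETE }

    /**
     * Merges clusters while some pair of clusters has a proximity above threshold.
     *
     * @param sim symmetric similarity matrix (higher = closer); an assumption of this sketch
     */
    static int[] cluster(double[][] sim, double threshold, Linkage linkage) {
        int n = sim.length;
        double[][] prox = new double[n][];
        for (int i = 0; i < n; i++) {
            prox[i] = sim[i].clone(); // cluster-level proximities, updated on every merge
        }
        boolean[] active = new boolean[n];
        Arrays.fill(active, true);
        int[] label = new int[n];
        for (int i = 0; i < n; i++) {
            label[i] = i; // every data point starts as its own cluster
        }
        while (true) {
            int a = -1;
            int b = -1;
            double best = threshold; // only pairs strictly above the threshold qualify
            for (int i = 0; i < n; i++) {
                if (!active[i]) continue;
                for (int j = i + 1; j < n; j++) {
                    if (active[j] && prox[i][j] > best) {
                        best = prox[i][j];
                        a = i;
                        b = j;
                    }
                }
            }
            if (a < 0) {
                break; // all remaining proximities are <= threshold
            }
            // Merge b into a and update the proximity of the merged cluster
            // to every other active cluster.
            for (int k = 0; k < n; k++) {
                if (!active[k] || k == a || k == b) continue;
                double p = (linkage == Linkage.SINGLE)
                        ? Math.max(prox[a][k], prox[b][k])
                        : Math.min(prox[a][k], prox[b][k]);
                prox[a][k] = p;
                prox[k][a] = p;
            }
            active[b] = false;
            for (int i = 0; i < n; i++) {
                if (label[i] == b) {
                    label[i] = a;
                }
            }
        }
        return label;
    }

    public static void main(String[] args) {
        double[][] sim = {
            {1.0, 0.9, 0.1, 0.1},
            {0.9, 1.0, 0.6, 0.1},
            {0.1, 0.6, 1.0, 0.8},
            {0.1, 0.1, 0.8, 1.0}
        };
        System.out.println(Arrays.toString(cluster(sim, 0.5, Linkage.SINGLE)));   // prints [0, 0, 0, 0]
        System.out.println(Arrays.toString(cluster(sim, 0.5, Linkage.COMPLETE))); // prints [0, 0, 2, 2]
    }
}
```

With the same threshold, the two linkage choices give different results on this matrix, which shows why the definition of cluster proximity matters for where the merging stops.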
public int[] cluster(Vector[] data)
This method is used for clustering via the TIRA Framework. The number of clusters is taken from KNNHAC(Configuration) or from setNumberOfClusters(int). If you are not using TIRA, you can use clusterDendrogram(Vector[]) to get a complete dendrogram of the clustering process.
cluster in interface Clusterer
cluster in class AClusterer
Parameters:
data - The vectors to cluster.
See Also:
KNNHAC(Configuration), setNumberOfClusters(int), clusterDendrogram(Vector[])

public Dendrogram<DoubleMerge> clusterDendrogram(Vector[] data)
Cluster given data hierarchically.
Parameters:
data - The data to be clustered.
Returns:
Dendrogram of the clustering process.

public static void main(java.lang.String[] args)
Parameters:
args
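
To give a feel for what a complete dendrogram of the clustering process, as returned by clusterDendrogram(Vector[]), records, the following sketch logs every merge together with the proximity at which it happened. The class DendrogramSketch and its nested Merge class are hypothetical and do not reflect the actual Dendrogram or DoubleMerge types; single link on a symmetric double[][] similarity matrix is again an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the library's Dendrogram or DoubleMerge. */
final class DendrogramSketch {

    /** Hypothetical merge record: cluster 'right' was merged into cluster 'left'. */
    static final class Merge {
        final int left;
        final int right;
        final double proximity;

        Merge(int left, int right, double proximity) {
            this.left = left;
            this.right = right;
            this.proximity = proximity;
        }

        @Override
        public String toString() {
            return "merge(" + left + ", " + right + ") at " + proximity;
        }
    }

    /**
     * Runs single-link merging down to one cluster and returns the merge history.
     *
     * @param sim symmetric similarity matrix with finite values (higher = closer)
     */
    static List<Merge> mergeHistory(double[][] sim) {
        int n = sim.length;
        int[] label = new int[n];
        for (int i = 0; i < n; i++) {
            label[i] = i; // every data point starts as its own cluster
        }
        List<Merge> merges = new ArrayList<>();
        for (int step = 0; step < n - 1; step++) {
            int bestA = -1;
            int bestB = -1;
            double best = Double.NEGATIVE_INFINITY;
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    if (label[i] != label[j] && sim[i][j] > best) {
                        best = sim[i][j];
                        bestA = label[i];
                        bestB = label[j];
                    }
                }
            }
            merges.add(new Merge(bestA, bestB, best)); // one dendrogram level per merge
            for (int i = 0; i < n; i++) {
                if (label[i] == bestB) {
                    label[i] = bestA;
                }
            }
        }
        return merges;
    }

    public static void main(String[] args) {
        double[][] sim = {
            {1.0, 0.9, 0.1, 0.1},
            {0.9, 1.0, 0.1, 0.1},
            {0.1, 0.1, 1.0, 0.8},
            {0.1, 0.1, 0.8, 1.0}
        };
        for (Merge m : mergeHistory(sim)) {
            System.out.println(m);
        }
    }
}
```

A history like this contains both stopping rules: cutting after n - numClusters merges corresponds to a fixed number of clusters, and cutting before the first merge whose proximity is at or below a threshold corresponds to the threshold variant.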


