java.lang.Object
  de.aitools.dm.clustering.algorithms.ASoftClusterer
    de.aitools.dm.clustering.algorithms.AClusterer
      de.aitools.dm.clustering.algorithms.SLink
public final class SLink
An algorithm for the single-link hierarchical clustering method. It starts
with each data vector being its own cluster and iteratively merges the
two clusters that have the highest proximity.
It is integrated into the TIRA Framework (see SLink(Configuration)),
but can be used without it: clusterDendrogram(Vector[]) is the usual way
of using this algorithm. If you only want to merge until a certain
threshold is reached, you can use cluster(Vector[], double). If you want to
merge until only a certain number of clusters is left, you can use
cluster(Vector[], int).
The algorithm runs in O(N^2) time, with N being the number of input
vectors, as all pairwise proximities must be computed. It requires only
O(N) memory, because these proximities are not all held at once but are
processed in a sequence of chunks.
It is taken from:
Sibson R.
SLINK: An optimally efficient algorithm for the single-link cluster method.
Computer Journal, 16, pages 30-34 (1973)
BibTeX:
@article{Sibson1973,
  author  = {Sibson R.},
  title   = {SLINK: An optimally efficient algorithm for the single-link cluster method},
  journal = {Computer Journal},
  year    = {1973},
  volume  = {16},
  pages   = {30-34}
}
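The recurrence from the cited paper can be sketched as follows. This is a minimal illustration, not the code of this class: the class name SlinkSketch, the plain double[] vectors and the Euclidean distance function are assumptions that stand in for Vector and Proximity<Vector>, and the sketch treats smaller values as "closer". It fills Sibson's pointer representation: point j joins the cluster containing point pi[j] at merge height lambda[j]. Only three arrays of length N are kept, which is where the O(N) memory bound comes from.

```java
import java.util.Arrays;

/** Sketch of Sibson's SLINK recurrence; an illustration only, not the aitools implementation. */
final class SlinkSketch {

    /** Assumed stand-in for a proximity measure: Euclidean distance (smaller means closer). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            double diff = a[k] - b[k];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    /** Fills pi and lambda so that point j joins the cluster containing pi[j] at height lambda[j]. */
    static void slink(double[][] data, int[] pi, double[] lambda) {
        int n = data.length;
        double[] m = new double[n]; // distances from the newly added point to all earlier points
        for (int i = 0; i < n; i++) {
            // The new point starts as the last member of its own cluster.
            pi[i] = i;
            lambda[i] = Double.POSITIVE_INFINITY;
            // One "chunk" of proximities: only row i of the distance matrix is held.
            for (int j = 0; j < i; j++) {
                m[j] = distance(data[j], data[i]);
            }
            // Update merge heights, passing distances on to the points further up the hierarchy.
            for (int j = 0; j < i; j++) {
                if (lambda[j] >= m[j]) {
                    m[pi[j]] = Math.min(m[pi[j]], lambda[j]);
                    lambda[j] = m[j];
                    pi[j] = i;
                } else {
                    m[pi[j]] = Math.min(m[pi[j]], m[j]);
                }
            }
            // Relink points whose target cluster has already been absorbed at a lower height.
            for (int j = 0; j < i; j++) {
                if (lambda[j] >= lambda[pi[j]]) {
                    pi[j] = i;
                }
            }
        }
    }

    public static void main(String[] args) {
        double[][] data = {{0}, {1}, {5}, {6}, {20}};
        int[] pi = new int[data.length];
        double[] lambda = new double[data.length];
        slink(data, pi, lambda);
        System.out.println(Arrays.toString(pi));     // [1, 3, 3, 4, 4]
        System.out.println(Arrays.toString(lambda)); // [1.0, 4.0, 1.0, 14.0, Infinity]
    }
}
```

In the example, points 0 and 1 merge at height 1, points 2 and 3 merge at height 1, the two pairs merge at height 4, and the outlier at 20 joins last at height 14.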
Field Summary |
---|
Fields inherited from interface de.aitools.dm.clustering.Clusterer |
---|
DEFAULT_SEED |
Constructor Summary |
---|
SLink(Configuration configuration): Create a new Single Link Clusterer. |
SLink(Proximity<Vector> proximityMeasure): Create a new Single Link Clusterer. This constructor only allows the use of cluster(Vector[], int), cluster(Vector[], double) and clusterDendrogram(Vector[]). |
Method Summary | |
---|---|
int[] | cluster(Vector[] data): This method is used for clustering via the TIRA Framework. |
int[] | cluster(Vector[] data, double threshold): Cluster given data hierarchically until the proximities between all clusters are less than or equal to threshold. |
int[] | cluster(Vector[] data, int numClusters): Cluster given data hierarchically until only numClusters clusters are left. |
Dendrogram<DoubleMerge> | clusterDendrogram(Vector[] data): Cluster given data hierarchically. |
void | setNumberOfClusters(int numClusters): Made for integration into the TIRA Framework. |
void | setProximityMeasure(Proximity<Vector> proximityMeasure): Sets the proximity measure to be used for the clustering steps. |
Methods inherited from class de.aitools.dm.clustering.algorithms.AClusterer |
---|
cluster, cluster, cluster, clusterSoft |
Methods inherited from class de.aitools.dm.clustering.algorithms.ASoftClusterer |
---|
clusterSoft, clusterSoft, clusterSoft, getBiggestRange |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public SLink(Proximity<Vector> proximityMeasure)
Create a new Single Link Clusterer. This constructor only allows the use of
cluster(Vector[], int), cluster(Vector[], double) and
clusterDendrogram(Vector[]). The other methods are made for
the TIRA Framework and can be used if a target number of
clusters has additionally been set (see setNumberOfClusters(int)).
Parameters:
proximityMeasure - The proximity measure to use for clustering.
public SLink(Configuration configuration)
Create a new Single Link Clusterer.
Parameters:
configuration - Object for configuring this clusterer: a Proximity<Vector> as in
SLink(Proximity), plus the settings needed by cluster(Vector[])
as they are used by TIRA.
Method Detail |
---|
public void setProximityMeasure(Proximity<Vector> proximityMeasure)
Sets the proximity measure to be used for the clustering steps.
Parameters:
proximityMeasure - The measure to use.
public void setNumberOfClusters(int numClusters)
Made for integration into the TIRA Framework. Because TIRA clusters through
cluster(Vector[]), which offers no way to specify the number of clusters,
this method (or SLink(Configuration)) can be used to tell the algorithm
how many clusters to generate.
Parameters:
numClusters - Number of clusters to generate. Must be greater than zero.
public int[] cluster(Vector[] data, int numClusters)
Cluster given data hierarchically until only numClusters clusters are left.
Parameters:
data - The vectors to cluster.
numClusters - Number of clusters to generate.
See Also:
cluster(Vector[], double), clusterDendrogram(Vector[])
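Stopping at a fixed number of clusters can be sketched as follows. This is an illustration under assumptions, not the code behind cluster(Vector[], int): it assumes the hierarchy is available in Sibson's pointer representation (point i joins the cluster containing pi[i] at height lambda[i], with the last point having an infinite height), that smaller heights mean closer clusters, and that the returned int[] holds one cluster label per input vector. The class name CutByCountSketch and the label numbering are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Sketch: apply the N - numClusters cheapest merges of a single-link hierarchy. */
final class CutByCountSketch {

    /** Union-find lookup with path halving. */
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    /** Returns one label per point; labels are numbered in order of first appearance. */
    static int[] cut(int[] pi, double[] lambda, int numClusters) {
        int n = pi.length;
        if (numClusters < 1 || numClusters > n) {
            throw new IllegalArgumentException("numClusters must be between 1 and " + n);
        }
        // Every point except the last describes exactly one merge; sort merges by height.
        Integer[] order = new Integer[n - 1];
        for (int i = 0; i < n - 1; i++) {
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> lambda[i]));

        int[] parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
        // Each merge reduces the cluster count by one, so n - numClusters merges remain to apply.
        for (int s = 0; s < n - numClusters; s++) {
            int i = order[s];
            parent[find(parent, i)] = find(parent, pi[i]);
        }

        int[] labels = new int[n];
        int[] idOfRoot = new int[n];
        Arrays.fill(idOfRoot, -1);
        int next = 0;
        for (int i = 0; i < n; i++) {
            int root = find(parent, i);
            if (idOfRoot[root] < 0) {
                idOfRoot[root] = next++;
            }
            labels[i] = idOfRoot[root];
        }
        return labels;
    }

    public static void main(String[] args) {
        // Hierarchy of the 1-D points 0, 1, 5, 6, 20 under Euclidean distance.
        int[] pi = {1, 3, 3, 4, 4};
        double[] lambda = {1.0, 4.0, 1.0, 14.0, Double.POSITIVE_INFINITY};
        System.out.println(Arrays.toString(cut(pi, lambda, 3))); // [0, 0, 1, 1, 2]
        System.out.println(Arrays.toString(cut(pi, lambda, 2))); // [0, 0, 0, 0, 1]
    }
}
```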
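public int[] cluster(Vector[] data, double threshold)
Cluster given data hierarchically until the proximities between all clusters
are less than or equal to threshold.
Parameters:
data - The vectors to cluster.
threshold - The threshold for clustering.
See Also:
cluster(Vector[], int), clusterDendrogram(Vector[])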
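Stopping at a threshold can be sketched in the same way. This is an illustration under assumptions, not the code behind cluster(Vector[], double): it assumes the hierarchy is given in Sibson's pointer representation (point i joins the cluster containing pi[i] at height lambda[i]) and that heights are distances, so the comparison is inverted relative to a similarity-style proximity: merging continues while the merge distance is at most the threshold. The class name CutByThresholdSketch and the label numbering are hypothetical.

```java
import java.util.Arrays;

/** Sketch: keep every merge of a single-link hierarchy whose height does not exceed a threshold. */
final class CutByThresholdSketch {

    /** Union-find lookup with path halving. */
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    /** Returns one label per point; labels are numbered in order of first appearance. */
    static int[] cut(int[] pi, double[] lambda, double maxDistance) {
        int n = pi.length;
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
        // Apply exactly those merges that happen at or below the threshold.
        for (int i = 0; i < n; i++) {
            if (lambda[i] <= maxDistance) {
                parent[find(parent, i)] = find(parent, pi[i]);
            }
        }
        int[] labels = new int[n];
        int[] idOfRoot = new int[n];
        Arrays.fill(idOfRoot, -1);
        int next = 0;
        for (int i = 0; i < n; i++) {
            int root = find(parent, i);
            if (idOfRoot[root] < 0) {
                idOfRoot[root] = next++;
            }
            labels[i] = idOfRoot[root];
        }
        return labels;
    }

    public static void main(String[] args) {
        // Hierarchy of the 1-D points 0, 1, 5, 6, 20 under Euclidean distance.
        int[] pi = {1, 3, 3, 4, 4};
        double[] lambda = {1.0, 4.0, 1.0, 14.0, Double.POSITIVE_INFINITY};
        System.out.println(Arrays.toString(cut(pi, lambda, 1.0))); // [0, 0, 1, 1, 2]
        System.out.println(Arrays.toString(cut(pi, lambda, 4.0))); // [0, 0, 0, 0, 1]
    }
}
```

Unlike a cut by cluster count, no sorting is needed here, because each merge is accepted or rejected independently of the others.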
public int[] cluster(Vector[] data)
This method is used for clustering via the TIRA Framework. The number of
clusters to generate is taken from SLink(Configuration) or from
setNumberOfClusters(int).
If you are not using TIRA, you can use clusterDendrogram(Vector[])
to get a complete dendrogram of the clustering process.
cluster in interface Clusterer
cluster in class AClusterer
Parameters:
data - The vectors to cluster.
See Also:
SLink(Configuration), setNumberOfClusters(int), cluster(Vector[], int), cluster(Vector[], double), clusterDendrogram(Vector[])
public Dendrogram<DoubleMerge> clusterDendrogram(Vector[] data)
Cluster given data hierarchically.
Parameters:
data - The data to be clustered.
Returns:
Dendrogram of the clustering process.