de.aitools.dm.clusterlabeling.common
Class SimpleTokenizer

java.lang.Object
  extended by de.aitools.dm.clusterlabeling.common.SimpleTokenizer
All Implemented Interfaces:
Tokenizer

public class SimpleTokenizer
extends java.lang.Object
implements Tokenizer

Version:
$Id: SimpleTokenizer.java,v 1.1 2011/11/15 10:51:25 hoppe Exp $
Author:
dennis.hoppe(/\t)uni-weimar.de

Constructor Summary
SimpleTokenizer(Decomposition decomposer, StopWordList stopwords, Stemmer stemmer)
           
SimpleTokenizer(java.util.Locale locale)
           
 
Method Summary
 java.util.Set<java.lang.String> filter(java.util.Set<java.lang.String> text, java.lang.String filter)
           
 java.util.Set<java.lang.String> tokenize(java.lang.String text)
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimpleTokenizer

public SimpleTokenizer(Decomposition decomposer,
                       StopWordList stopwords,
                       Stemmer stemmer)

SimpleTokenizer

public SimpleTokenizer(java.util.Locale locale)

Method Detail

tokenize

public java.util.Set<java.lang.String> tokenize(java.lang.String text)
Specified by:
tokenize in interface Tokenizer
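
This page gives no description of how tokenize behaves. The only thing the signature guarantees is the return type: java.util.Set<java.lang.String>, so a token that occurs several times in the input appears once in the result. The sketch below illustrates that consequence only. TokenizeSketch, TokenizerLike, and the lowercase-and-split rule are hypothetical stand-ins introduced for illustration; they are not part of de.aitools.dm.clusterlabeling.common, and the real SimpleTokenizer presumably also applies the Decomposition, StopWordList, and Stemmer (or the Locale) passed to its constructor.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class TokenizeSketch {

    // Hypothetical stand-in that mirrors only the tokenize signature listed above.
    // It is NOT the library's Tokenizer interface.
    interface TokenizerLike {
        Set<String> tokenize(String text);
    }

    // Assumed behavior, for illustration only: lowercase the text and split it
    // on runs of non-letter characters. The real class may do much more.
    static final TokenizerLike STAND_IN = text -> {
        Set<String> tokens = new LinkedHashSet<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece); // a Set silently ignores repeated tokens
            }
        }
        return tokens;
    };

    // Convenience wrapper around the stand-in tokenizer.
    static Set<String> tokenize(String text) {
        return STAND_IN.tokenize(text);
    }

    public static void main(String[] args) {
        // "the" and "cat" occur twice in the input but once in the result.
        System.out.println(tokenize("The cat saw the other cat"));
        // prints [the, cat, saw, other]
    }
}
```

The stand-in uses a LinkedHashSet so the printed order is predictable. The Set return type of the real method makes no ordering promise, so callers should not rely on token order unless the library documents it elsewhere.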

filter

public java.util.Set<java.lang.String> filter(java.util.Set<java.lang.String> text,
                                              java.lang.String filter)
Specified by:
filter in interface Tokenizer