de.aitools.dm.clusterlabeling.common
Class SimpleTokenizer
java.lang.Object
de.aitools.dm.clusterlabeling.common.SimpleTokenizer
- All Implemented Interfaces:
- Tokenizer
public class SimpleTokenizer
- extends java.lang.Object
- implements Tokenizer
- Version:
- $Id: SimpleTokenizer.java,v 1.1 2011/11/15 10:51:25 hoppe Exp $
- Author:
- dennis.hoppe(/\t)uni-weimar.de
Method Summary
java.util.Set<java.lang.String> | filter(java.util.Set<java.lang.String> text, java.lang.String filter)
java.util.Set<java.lang.String> | tokenize(java.lang.String text)
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SimpleTokenizer
public SimpleTokenizer(Decomposition decomposer, StopWordList stopwords, Stemmer stemmer)
SimpleTokenizer
public SimpleTokenizer(java.util.Locale locale)
tokenize
public java.util.Set<java.lang.String> tokenize(java.lang.String text)
- Specified by:
- tokenize in interface Tokenizer
filter
public java.util.Set<java.lang.String> filter(java.util.Set<java.lang.String> text, java.lang.String filter)
- Specified by:
- filter in interface Tokenizer
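Example (illustrative sketch only):
This page lists signatures but does not describe behaviour, so the following self-contained sketch only shows the shape of the two methods specified by Tokenizer. The nested Tokenizer interface is a local stand-in that copies the signatures above, and ToyTokenizer is a hypothetical class, not the real SimpleTokenizer. Its splitting, lower-casing, and "drop tokens equal to the filter string" rules are assumptions made for illustration; the actual class may decompose, stem, or remove stop words differently.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class TokenizerSketch {

    /** Local stand-in mirroring the two method signatures documented above. */
    interface Tokenizer {
        Set<String> tokenize(String text);
        Set<String> filter(Set<String> text, String filter);
    }

    /** Hypothetical toy implementation; NOT the behaviour of SimpleTokenizer. */
    static class ToyTokenizer implements Tokenizer {
        private final Locale locale;

        ToyTokenizer(Locale locale) {
            this.locale = locale;
        }

        @Override
        public Set<String> tokenize(String text) {
            // Assumption: split on runs of non-word characters and lower-case.
            Set<String> tokens = new LinkedHashSet<>();
            for (String part : text.split("\\W+")) {
                if (!part.isEmpty()) {
                    tokens.add(part.toLowerCase(locale));
                }
            }
            return tokens;
        }

        @Override
        public Set<String> filter(Set<String> text, String filter) {
            // Assumption: return a copy without the token equal to the filter.
            Set<String> kept = new LinkedHashSet<>(text);
            kept.remove(filter);
            return kept;
        }
    }

    public static void main(String[] args) {
        Tokenizer tokenizer = new ToyTokenizer(Locale.ENGLISH);
        Set<String> tokens = tokenizer.tokenize("The cat and the hat");
        System.out.println(tokens);                          // [the, cat, and, hat]
        System.out.println(tokenizer.filter(tokens, "the")); // [cat, and, hat]
    }
}
```

Because both methods return a java.util.Set, duplicate tokens collapse into one entry; the sketch uses LinkedHashSet only to keep the printed order predictable.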