de.aitools.ie.languagedetection
Class LanguageDetector

java.lang.Object
  extended by de.aitools.ie.languagedetection.LanguageDetector

public class LanguageDetector
extends java.lang.Object

This class is the main interface to the language detection package.

TODO(loose): The interface (the getLanguage() method) could just as well be static. In the future, however, callers may want more information, for example the second most probable language, or the distance between the most probable language and the next one.

TODO(loose): Fix the models for pl and lt. These models (and probably a few others) become the best guess when the text contains a lot of whitespace, special characters, and so on, so the (wiki) language corpus still seems to have problems. Vertical search results make a good test, because these texts come more or less randomly from the web.
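The first TODO mentions exposing the second most probable language and its distance from the best guess. The following is only a sketch of what such a result could look like. RankedDetectionSketch, Candidate, rank and margin are hypothetical names and not part of this package, and the scores are assumed to be "higher is better" values supplied by the caller.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; none of these types exist in de.aitools.ie.languagedetection.
public class RankedDetectionSketch {

    /** One candidate language with an assumed score (higher means more probable). */
    public static final class Candidate {
        public final Locale language;
        public final double score;

        public Candidate(Locale language, double score) {
            this.language = language;
            this.score = score;
        }
    }

    /** Sorts the per-language scores so that the most probable language comes first. */
    public static List<Candidate> rank(Map<Locale, Double> scores) {
        List<Candidate> ranked = new ArrayList<>();
        for (Map.Entry<Locale, Double> e : scores.entrySet()) {
            ranked.add(new Candidate(e.getKey(), e.getValue()));
        }
        ranked.sort(Comparator.comparingDouble((Candidate c) -> c.score).reversed());
        return ranked;
    }

    /** Score gap between the best and the second-best candidate; 0 if there is no runner-up. */
    public static double margin(List<Candidate> ranked) {
        return ranked.size() < 2 ? 0.0 : ranked.get(0).score - ranked.get(1).score;
    }
}
```

A small margin would signal an uncertain detection, which is one reason to keep the method non-static and return a richer result later on.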

Author:
fabian.loose@uni-weimar.de, martin.potthast@uni-weimar.de

Field Summary
static java.lang.String SERIALIZATION_NAME
           
 
Constructor Summary
LanguageDetector()
           
 
Method Summary
 java.util.Locale detect(java.lang.String s)
          Detects the language of a string based on its character trigrams.
static void main(java.lang.String[] args)
          Required so that the language model index can be initialized by Ant.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SERIALIZATION_NAME

public static final java.lang.String SERIALIZATION_NAME
See Also:
Constant Field Values
Constructor Detail

LanguageDetector

public LanguageDetector()
Method Detail

detect

public java.util.Locale detect(java.lang.String s)
Detects the language of a string based on its character trigrams.
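This page does not document how the trigrams are extracted or compared against the language models. As a minimal illustration of what "character trigrams" means, the sketch below counts every overlapping three-character substring of a string. TrigramSketch and trigramCounts are hypothetical names, and the actual implementation may normalize or weight the input differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the implementation used by LanguageDetector.
public class TrigramSketch {

    /** Counts the overlapping character trigrams of the given text, in order of first occurrence. */
    public static Map<String, Integer> trigramCounts(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        // A string of length n contains n - 2 overlapping trigrams (none if n < 3).
        for (int i = 0; i + 3 <= text.length(); i++) {
            counts.merge(text.substring(i, i + 3), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {ban=1, ana=2, nan=1}
        System.out.println(trigramCounts("banana"));
    }
}
```

A detector of this kind would typically compare such a profile with precomputed profiles per language and return the closest match as a Locale.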


main

public static void main(java.lang.String[] args)
Required so that the language model index can be initialized by Ant.