de.aitools.aq.invertedindex.core
Class Indexer<V extends Value>

java.lang.Object
  extended by de.aitools.aq.invertedindex.core.Indexer<V>

public class Indexer<V extends Value>
extends java.lang.Object

A class to build an inverted index programmatically from a number of records. The created index is static and read-only.

Version:
$Id: Indexer.java,v 1.8 2011/04/18 19:57:36 trenkman Exp $
Author:
martin.trenkmann@uni-weimar.de

Method Summary
 void close()
          Closes and deletes the native index.
 Properties index()
          Starts the final indexing process.
static
<V extends Value>
Indexer<V>
open(java.lang.Class<V> clazz, Configuration config)
          Instantiates, in a dedicated directory, a new writable inverted index that can be filled programmatically with a (huge) number of records.
 boolean put(Record<V> record)
          Inserts a record into the indexer.
 boolean put(java.lang.String key, V value)
          Inserts a key/value pair into the indexer.
 void setExpectedNumberOfRecords(long numberOfRecords)
          A tuning method to indicate the expected number of records to be inserted into the indexer.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

open

public static <V extends Value> Indexer<V> open(java.lang.Class<V> clazz,
                                                Configuration config)

Instantiates, in a dedicated directory, a new writable inverted index that can be filled programmatically with a (huge) number of records.

Please read the documentation to set up the Configuration object properly. In contrast to ManagedIndexer, this indexing method provides Unicode support and allows whitespace (but no tabs) in key and value strings.

The configuration parameters required for this job are:

Type Parameters:
V - the value type parameter
Parameters:
clazz - the class object of the value type
config - the indexing job configuration
Returns:
an instance of Indexer ready to write

close

public void close()
Closes and deletes the native index. Calling this method is redundant after a successful call to index(), which also closes the index. However, you can call it in cleanup code if your program terminates abnormally during the put(Record) phase.
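
The cleanup scenario described above can be sketched roughly as follows. The library itself is not used here: RecordSink is a hypothetical interface that mirrors only the put, index, and close methods documented on this page (with String in place of V and without the Properties return value), and InMemorySink is a toy stand-in that merely records which calls were made. The idea is to call close() only when index() did not complete.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in that mirrors the documented put/index/close methods.
interface RecordSink {
    boolean put(String key, String value);
    void index();
    void close();
}

// Toy implementation for demonstration only; it logs the calls it receives.
class InMemorySink implements RecordSink {
    final List<String> calls = new ArrayList<>();

    public boolean put(String key, String value) {
        if (key == null) {
            throw new IllegalArgumentException("null key");
        }
        calls.add("put");
        return true;
    }

    public void index() { calls.add("index"); }

    public void close() { calls.add("close"); }
}

class CleanupSketch {
    // Inserts all pairs and then indexes. Returns true if indexing completed.
    // close() is called only if the put phase or index() failed.
    static boolean buildIndex(RecordSink sink, String[][] pairs) {
        boolean indexed = false;
        try {
            for (String[] pair : pairs) {
                sink.put(pair[0], pair[1]);
            }
            sink.index();   // per the documentation, this also closes the index
            indexed = true;
        } catch (RuntimeException e) {
            // Abnormal interruption; a real program would log or rethrow here.
        } finally {
            if (!indexed) {
                sink.close();   // cleanup path for an unfinished index
            }
        }
        return indexed;
    }
}
```

With the real class, the same try/finally shape would wrap the calls on the Indexer instance returned by open.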


setExpectedNumberOfRecords

public void setExpectedNumberOfRecords(long numberOfRecords)

A tuning method to indicate the expected number of records to be inserted into the indexer. Providing such an upper bound at an early stage of the put(Record) or put(String, Value) requests can increase indexing performance dramatically.

Note that this value is only a hint and takes effect only if Configuration.KeySorting.UNSORTED is used. An inaccurate estimate has no negative impact on the indexing behavior.

Parameters:
numberOfRecords - an upper bound on the number of records to be inserted
See Also:
Configuration.setExpectedNumberOfRecords(long)

index

public Properties index()

Starts the final indexing process. This method must be called at the very end of a sequence of put(Record) or put(String, Value) requests. Afterwards, the native indexer is closed and deleted automatically and can no longer be used.

Please note that, depending on the number of inserted records, this method can take a while, so be patient and watch the console output.

Returns:
a Properties object with some information about the created index.
See Also:
put(Record), put(String, Value)

put

public boolean put(java.lang.String key,
                   V value)

Inserts a key/value pair into the indexer. Note that this method is not thread-safe, as it uses shared memory internally to transfer the data to the native side. In multi-threaded code, you have to synchronize calls to this method.

Parameters:
key - the key to be inserted
value - the value to be inserted
Returns:
true if that record could be inserted successfully, false otherwise
See Also:
put(Record)

put

public boolean put(Record<V> record)

Inserts a record into the indexer. Note that this method is not thread-safe, as it uses shared memory internally to transfer the data to the native side. In multi-threaded code, you have to synchronize calls to this method.

Parameters:
record - the record to be inserted
Returns:
true if that record could be inserted successfully, false otherwise
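
The synchronization requirement that applies to both put variants can be sketched as follows. This is only an illustration under stated assumptions: KeyValueSink is a hypothetical interface mirroring the documented put(String, V) signature (with String in place of V), and CountingSink is a deliberately non-thread-safe toy stand-in for the indexer. All worker threads share one lock object, so at most one put call is in flight at any time.

```java
// Hypothetical stand-in that mirrors the documented put(String, V) method.
interface KeyValueSink {
    boolean put(String key, String value);
}

// Toy sink that is deliberately not thread-safe: a plain, unsynchronized counter.
class CountingSink implements KeyValueSink {
    long count = 0;

    public boolean put(String key, String value) {
        count++;
        return true;
    }
}

class SynchronizedPutSketch {
    // Starts `threads` workers that each insert `perThread` pairs into the
    // shared sink, serializing every put call on a common lock.
    static void fill(KeyValueSink sink, int threads, int perThread) {
        final Object lock = new Object();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    synchronized (lock) {   // only one thread may call put at a time
                        sink.put("key-" + id + "-" + i, "value");
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();   // wait until every worker has finished inserting
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
    }
}
```

Locking on a single shared object keeps the example small; an alternative design is to funnel all records through one dedicated writer thread, which avoids lock contention at the cost of a queue.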