de.aitools.aq.bighashmap.core
Class BigHashMap<V extends Value>

java.lang.Object
  extended by de.aitools.aq.bighashmap.core.BigHashMap<V>
Type Parameters:
V - the value type of this hash map
All Implemented Interfaces:
java.util.Map<java.lang.String,V>

public class BigHashMap<V extends Value>
extends java.lang.Object
implements java.util.Map<java.lang.String,V>

A simple hash map to store a large set of key/value pairs. This representation is backed by an external hash table, which is implemented in native code for best access performance. For simplicity, this hash map takes String as its key type and a subclass of Value (only those provided by this project) as its value type.

Unlike most other Map implementations, this hash map is constructed from a text representation of key/value pairs, which the user has to provide. See the documentation for the exact specification of this text format.

Once constructed, the hash map is completely read-only. Hence the majority of Map's interface, including every method that would modify the hash map data, is not supported. Keep in mind that the purpose of this hash map is efficient lookup on very large key sets that do not fit into memory.

This hash map has no built-in thread safety. Using this class in multi-threaded applications requires enclosing each method invocation in its own synchronized block.
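
As a minimal sketch of the locking described above, the following wrapper guards every call with one shared lock. It is written against the plain java.util.Map interface so that it compiles without the native library. SynchronizedLookup and its methods are hypothetical names and not part of this package, and a HashMap stands in for the opened hash map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: serializes all read access to a map that is not thread-safe. */
final class SynchronizedLookup<V> {
    private final Map<String, V> map;          // e.g. an opened BigHashMap (assumption)
    private final Object lock = new Object();  // one lock shared by all callers

    SynchronizedLookup(Map<String, V> map) {
        this.map = map;
    }

    /** Each invocation runs inside its own synchronized block. */
    V get(String key) {
        synchronized (lock) {
            return map.get(key);
        }
    }

    boolean containsKey(String key) {
        synchronized (lock) {
            return map.containsKey(key);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // A HashMap stands in for the real map so the sketch runs without native code.
        Map<String, Long> standIn = new HashMap<>();
        standIn.put("hello world", 42L);
        SynchronizedLookup<Long> lookup = new SynchronizedLookup<>(standIn);

        // Two threads share the wrapper; their lookups never overlap.
        Runnable reader = () -> System.out.println(lookup.get("hello world"));
        Thread a = new Thread(reader);
        Thread b = new Thread(reader);
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

Routing all threads through a single wrapper instance keeps the locking in one place instead of repeating a synchronized block at every call site.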

Furthermore, due to the underlying C/C++ implementation, there is no Unicode support for the input files. Non-ASCII characters may be truncated, which could create duplicate keys and, in turn, cause the CMPH library to crash. We are working on this limitation.
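
One way to guard against this limitation is to check keys before writing them to the input files. The sketch below uses only the standard java.nio.charset API. AsciiKeyCheck and isPureAscii are hypothetical names, and rejecting non-ASCII keys up front is a suggestion, not something this package does for you.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical pre-check for keys that are about to be written to an input file. */
final class AsciiKeyCheck {

    /** Returns true if every character of key can be encoded as US-ASCII. */
    static boolean isPureAscii(String key) {
        // A fresh encoder per call avoids sharing encoder state between threads.
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        return encoder.canEncode(key);
    }

    public static void main(String[] args) {
        String[] candidates = { "hello world", "caf\u00e9" };
        for (String key : candidates) {
            if (!isPureAscii(key)) {
                System.err.println("Skipping non-ASCII key: " + key);
            }
        }
    }
}
```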

Version:
$Id: BigHashMap.java,v 1.20 2012/05/18 15:53:56 trenkman Exp $
Author:
martin.trenkmann@uni-weimar.de

Nested Class Summary
 
Nested classes/interfaces inherited from interface java.util.Map
java.util.Map.Entry<K,V>
 
Method Summary
static
<V extends Value>
java.io.File
build(java.lang.Class<V> clazz, java.io.File inputDir, java.io.File outputDir)
          Builds the external hash map from the input files and returns the path to its index file.
 void clear()
          Not supported.
 void close()
          Closes the hash map and deletes the underlying native object.
 boolean containsKey(java.lang.Object key)
           
 boolean containsValue(java.lang.Object value)
          Not supported.
 java.util.Set<java.util.Map.Entry<java.lang.String,V>> entrySet()
          Not supported.
 V get(java.lang.Object key)
           
 boolean get(java.lang.String key, V value)
          Looks up a value with object reuse.
 boolean isEmpty()
           
 java.util.Set<java.lang.String> keySet()
          Not supported.
static
<V extends Value>
BigHashMap<V>
open(java.lang.Class<V> clazz, java.io.File indexFile, Memory memory)
          Opens and initializes the hash map represented by indexFile.
 V put(java.lang.String key, V value)
          Not supported.
 void putAll(java.util.Map<? extends java.lang.String,? extends V> m)
          Not supported.
 V remove(java.lang.Object key)
          Not supported.
 int size()
          Note that for big hash maps the actual size may exceed Integer.MAX_VALUE.
 long sizeLong()
          Returns the size of this hash map as a long value, since the actual size is not guaranteed to fit into the int returned by size().
 java.util.Collection<V> values()
          Not supported.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface java.util.Map
equals, hashCode
 

Method Detail

build

public static <V extends Value> java.io.File build(java.lang.Class<V> clazz,
                                                   java.io.File inputDir,
                                                   java.io.File outputDir)
Builds the external hash map from the input files and returns the path to its index file.

Parameters:
clazz - the class object of the value type. If the value type is not supported, an UnsupportedOperationException is thrown. If the value type is supported but does not match the type serialized in the input files, parsing errors or undefined behavior will result.
inputDir - the directory that contains one or more input files, and nothing more. For the format specification, see the documentation.
outputDir - an existing, empty directory in which the external hash map will be stored.
Returns:
The path to the index file, which is passed to open(Class, File, Memory) to open the hash map.
Throws:
java.lang.UnsupportedOperationException
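
The directory requirements above can be checked before calling build. The following is a minimal sketch using only java.io.File. BuildPreconditions and its methods are hypothetical helpers, and reading "nothing more" as "regular files only" is an assumption about the input directory.

```java
import java.io.File;

/** Hypothetical checks for the inputDir and outputDir arguments of build. */
final class BuildPreconditions {

    /** True if dir exists, is a readable directory, and contains no entries. */
    static boolean isEmptyDirectory(File dir) {
        String[] names = dir.list();  // null if dir is not a readable directory
        return names != null && names.length == 0;
    }

    /** True if dir holds at least one entry and every entry is a regular file. */
    static boolean containsOnlyFiles(File dir) {
        File[] entries = dir.listFiles();
        if (entries == null || entries.length == 0) {
            return false;
        }
        for (File entry : entries) {
            if (!entry.isFile()) {
                return false;  // subdirectories count as "something more"
            }
        }
        return true;
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: BuildPreconditions <inputDir> <outputDir>");
            return;
        }
        System.out.println("inputDir ok:  " + containsOnlyFiles(new File(args[0])));
        System.out.println("outputDir ok: " + isEmptyDirectory(new File(args[1])));
    }
}
```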

open

public static <V extends Value> BigHashMap<V> open(java.lang.Class<V> clazz,
                                                   java.io.File indexFile,
                                                   Memory memory)
Opens and initializes the hash map represented by indexFile.

Parameters:
clazz - the class object of the value type. If the value type is not supported, an exception is thrown. If the value type is supported but does not match the actual native type, the behavior is undefined.
indexFile - the path to the initialization file previously returned by build(Class, File, File).
memory - a hint for how much memory this instance may use at runtime. Currently, if you provide enough memory, the entire table data is mapped into memory to avoid disk I/O at runtime.

close

public void close()
Closes the hash map and deletes the underlying native object. This cleanup is not mandatory, but it is a convenient way to explicitly free the memory allocated by the native object. After calling this method, the hash map can no longer be used.
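
A common way to make sure close() runs exactly once, even if a lookup fails, is a try/finally block. The sketch below uses a hypothetical stand-in class, NativeBackedTable, so that it runs without the native library. Throwing IllegalStateException on use after close is an assumption of the stand-in; this page does not state how the real class reacts.

```java
/** Sketch of the close-in-finally pattern with a hypothetical stand-in resource. */
final class CloseSketch {

    /** Hypothetical stand-in for an object that owns native memory. */
    static final class NativeBackedTable {
        private boolean closed;

        long lookup(String key) {
            if (closed) {
                // Assumption: the stand-in fails loudly once it has been closed.
                throw new IllegalStateException("table already closed");
            }
            return key.length();  // dummy payload instead of a real lookup
        }

        void close() {
            closed = true;  // the real class would free its native object here
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Performs one lookup and always releases the table afterwards. */
    static long lookupAndClose(NativeBackedTable table, String key) {
        try {
            return table.lookup(key);
        } finally {
            table.close();  // runs on both the normal and the exceptional path
        }
    }

    public static void main(String[] args) {
        NativeBackedTable table = new NativeBackedTable();
        System.out.println(lookupAndClose(table, "hello world"));
        System.out.println("closed: " + table.isClosed());
    }
}
```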


sizeLong

public long sizeLong()
Returns the size of this hash map as a long value, since the actual size is not guaranteed to fit into the int returned by size().

Returns:
the actual number of entries in this hash map
See Also:
size()

size

public int size()
Note that for big hash maps the actual size may exceed Integer.MAX_VALUE. Use sizeLong() instead.

Specified by:
size in interface java.util.Map<java.lang.String,V extends Value>
Returns:
the actual number of entries in this hash map
See Also:
sizeLong()
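
For context, the general contract of java.util.Map.size() is to return Integer.MAX_VALUE when a map holds more than Integer.MAX_VALUE elements. The sketch below shows that clamping for a long entry count. SizeClamp and toIntSize are hypothetical names, and whether size() in this class behaves this way is not stated on this page.

```java
/** Hypothetical helper showing how a long entry count maps onto Map.size(). */
final class SizeClamp {

    /** Clamps a non-negative long count to the int range, as Map.size() specifies. */
    static int toIntSize(long actualSize) {
        return actualSize > Integer.MAX_VALUE ? Integer.MAX_VALUE : (int) actualSize;
    }

    public static void main(String[] args) {
        long small = 1_000L;
        long huge = 3_000_000_000L;  // larger than Integer.MAX_VALUE
        System.out.println(toIntSize(small));
        System.out.println(toIntSize(huge));
    }
}
```

Code that needs the exact count should therefore call sizeLong() and keep the result in a long.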

isEmpty

public boolean isEmpty()
Specified by:
isEmpty in interface java.util.Map<java.lang.String,V extends Value>

containsKey

public boolean containsKey(java.lang.Object key)
Specified by:
containsKey in interface java.util.Map<java.lang.String,V extends Value>

get

public V get(java.lang.Object key)
Specified by:
get in interface java.util.Map<java.lang.String,V extends Value>

get

public boolean get(java.lang.String key,
                   V value)
Looks up a value with object reuse. This method returns true if a mapping for key exists and false otherwise. In both cases the contents of value may change; if the method returns false, those contents must be ignored.
Use with caution! JUnit tests suggest that this method sometimes crashes the JVM. This may be a JVM/JNA threading issue that does not originate in the native library: the native library is thread-safe and free of memory leaks, and running exactly the same unit test on the native side did not reveal any problems.
You can use get(Object) instead, which seems to be more stable.

Parameters:
key - the key to look up
value - the object that receives the value mapped to key
Returns:
true if a value for key was found, false otherwise
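
The object-reuse calling convention can be sketched without the native library as follows. ReuseLookupSketch, Counter, and the static get method are hypothetical stand-ins that only mirror the shape of get(String, V); a HashMap replaces the real table.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a lookup that fills a caller-supplied object instead of allocating one. */
final class ReuseLookupSketch {

    /** Hypothetical mutable holder, standing in for a Value subclass. */
    static final class Counter {
        long count;
    }

    /** Fills out and returns true if key is mapped; returns false otherwise. */
    static boolean get(Map<String, Long> table, String key, Counter out) {
        Long stored = table.get(key);
        if (stored == null) {
            return false;  // the caller must ignore the contents of out
        }
        out.count = stored;
        return true;
    }

    /** Sums the values of all keys that are present, reusing a single holder. */
    static long sum(Map<String, Long> table, String... keys) {
        Counter reused = new Counter();  // allocated once for all lookups
        long total = 0;
        for (String key : keys) {
            if (get(table, key, reused)) {  // read the holder only on success
                total += reused.count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> table = new HashMap<>();
        table.put("hello", 3L);
        table.put("world", 4L);
        System.out.println(sum(table, "hello", "world", "missing"));
    }
}
```

Reading the holder only when the call returns true is what makes reuse safe: a stale value left over from a previous successful lookup is never mistaken for a hit.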

containsValue

public boolean containsValue(java.lang.Object value)
Not supported.

Specified by:
containsValue in interface java.util.Map<java.lang.String,V extends Value>

put

public V put(java.lang.String key,
             V value)
Not supported.

Specified by:
put in interface java.util.Map<java.lang.String,V extends Value>

remove

public V remove(java.lang.Object key)
Not supported.

Specified by:
remove in interface java.util.Map<java.lang.String,V extends Value>

putAll

public void putAll(java.util.Map<? extends java.lang.String,? extends V> m)
Not supported.

Specified by:
putAll in interface java.util.Map<java.lang.String,V extends Value>

clear

public void clear()
Not supported.

Specified by:
clear in interface java.util.Map<java.lang.String,V extends Value>

keySet

public java.util.Set<java.lang.String> keySet()
Not supported.

Specified by:
keySet in interface java.util.Map<java.lang.String,V extends Value>

values

public java.util.Collection<V> values()
Not supported.

Specified by:
values in interface java.util.Map<java.lang.String,V extends Value>

entrySet

public java.util.Set<java.util.Map.Entry<java.lang.String,V>> entrySet()
Not supported.

Specified by:
entrySet in interface java.util.Map<java.lang.String,V extends Value>