@webis-de/gen-ir-sim 0.7.0

GenIRSim

Flexible and easy-to-use simulation and evaluation framework for generative IR

A run of GenIRSim consists of two stages, which the main run function executes sequentially: simulation and evaluation. During simulation (see the simulate function), a user converses with a generative IR system. During evaluation (see the evaluate function), the generated conversation is judged by one or more evaluators. The run is defined by a configuration in JSON format (see the configuration parameter of the functions mentioned above and the configurations directory for examples). If you create your own user, system, or evaluator classes, make sure to register them using the options.additionalUsers/Systems/Evaluators parameters of these functions.

run

src/index.js

Simulates and evaluates an interaction with a generative information retrieval system.

run(configuration: (Object | string), options: Object?, replacements: (Object | Array | String)?): (Evaluation | Array)

Parameters

configuration ((Object | string)) The configuration for the simulation and evaluation, either as object or a JSON string

Name	Description
configuration.simulation `Object`	The configuration for the simulation, see simulate
configuration.evaluation `Object`	The configuration for the evaluation, see evaluate

options (Object? = undefined)

Name	Description
options.logCallback `function?`	The function to consume all LogbookEntry of the simulation and evaluation
options.additionalUsers `Object?`	Object that contains non-standard User classes as values; if `configuration.simulation.user.class` is the same as a key of this object, the corresponding class will be instantiated and used
options.additionalSystems `Object?`	Object that contains non-standard System classes as values; if `configuration.simulation.system.class` is the same as a key of this object, the corresponding class will be instantiated and used
options.additionalEvaluators `Object?`	Object that contains non-standard Evaluator classes as values; if `configuration.evaluation.evaluators.[evaluatorName].class` is the same as a key of this object, the corresponding class will be instantiated and used

replacements ((Object | Array | String)? = undefined) An object that specifies the values by which template variables in the configuration ({{variable}}) are replaced. If the configuration is a string, the replacement happens before JSON parsing, which allows variables to be replaced with JSON structures. Variables in the configuration for which no value is specified in replacements are ignored. If replacements is an array, this function is executed for each of its elements and the resulting list of evaluation objects is returned. If replacements is a string, it is treated as a tab-separated values file that specifies an array of replacements: the first line gives the variable name for each column, and the values in each further line are the respective replacements.

Returns

(Evaluation | Array): The evaluation object or an array of these if replacements is an array or string; an empty object is returned in case of an error
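The effect of replacing variables before JSON parsing can be illustrated with a small stand-alone sketch. The helper fillTemplate is hypothetical and is not GenIRSim's implementation; it only mimics the described behaviour for flat variable names and keeps variables without a replacement verbatim (an assumption about what "ignored" means).

```javascript
// Hypothetical helper (not GenIRSim's implementation): substitutes
// {{variable}} placeholders in a configuration string before JSON parsing.
function fillTemplate(configurationString, replacements) {
  return configurationString.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    // Variables without a replacement are kept as they are in this sketch
    name in replacements ? String(replacements[name]) : match
  );
}

// Because substitution happens on the string, a variable can expand to a
// whole JSON structure, not just to a string value.
const configurationString =
  '{"simulation": {"topic": {{topic}}, "maxTurns": {{maxTurns}}, "note": "{{unused}}"}}';

// One entry per run, mirroring the "array of replacements" case
const replacementsList = [
  { topic: '{"description": "How do solar panels work?"}', maxTurns: 2 },
  { topic: '{"description": "How do heat pumps work?"}', maxTurns: 4 },
];

const configurations = replacementsList.map((replacements) =>
  JSON.parse(fillTemplate(configurationString, replacements))
);

console.log(configurations[0].simulation.topic.description);
console.log(configurations[1].simulation.maxTurns);
```

With the real package, the string and the array would be passed to run directly, which then returns one evaluation object per array element.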

simulate

src/index.js

Simulates an interaction with a generative information retrieval system.

Use run instead unless you want to skip the evaluation.

simulate(configuration: Object, options: Object?): Simulation

Parameters

configuration (Object) The configuration for the simulation

Name	Description
configuration.topic `Topic`	The topic for the simulation
configuration.user `Object`	The configuration passed to the user in the constructor
configuration.user.class `string`	The name of the user class, either one of the standard classes of GenIRSim or one in `additionalUsers`
configuration.system `Object`	The configuration passed to the system in the constructor
configuration.system.class `string`	The name of the system class, either one of the standard classes of GenIRSim or one in `additionalSystems`
configuration.maxTurns `number?`	The maximum number of user turns to simulate (default: 3)

options (Object? = undefined)

Name	Description
options.logCallback `function?`	The function to consume all LogbookEntry of the simulation
options.additionalUsers `Object?`	Object that contains non-standard User classes as values; if `configuration.user.class` is the same as a key of this object, the corresponding class will be instantiated and used
options.additionalSystems `Object?`	Object that contains non-standard System classes as values; if `configuration.system.class` is the same as a key of this object, the corresponding class will be instantiated and used

Returns

Simulation: The simulation object
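As a rough illustration of the parameters above, a simulation configuration could look like the following sketch. The class names come from this reference; all URLs, the model name, and the prompt texts are placeholders, and the sketch makes no claim about the output format that StaticUser expects from the language model.

```javascript
// Sketch of a simulation configuration. URLs, model name, and prompt texts
// are placeholders chosen for illustration only.
const llm = { url: "http://localhost:11434/api/chat", model: "example-model" };

const simulationConfiguration = {
  topic: { description: "Find out how heat pumps work." },
  user: {
    class: "StaticUser",
    llm,
    start: "Write a first question about: {{variables.topic.description}}",
    followUp:
      "The system answered: {{variables.systemResponse.utterance}}\n" +
      "Write a follow-up question about: {{variables.topic.description}}",
  },
  system: {
    class: "BasicChatSystem",
    url: "http://localhost:8080/chat",
    request: {},
  },
  maxTurns: 3,
};

// With the package installed, this object would be passed as the first
// argument of simulate, or as configuration.simulation of run.
console.log(JSON.stringify(simulationConfiguration, null, 2));
```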

evaluate

src/index.js

Evaluates a simulated interaction with a generative information retrieval system.

evaluate(simulation: Simulation, configuration: Object, options: Object?): Evaluation

Parameters

simulation (Simulation) The simulation to evaluate

configuration (Object) The configuration for the evaluation

Name	Description
configuration.evaluators `Object`	An object where each value is another configuration object that (1) is passed to the respective evaluator in the constructor and (2) has a property `class` that is the name of the evaluator class, either one of the standard classes of GenIRSim or one in `additionalEvaluators`

options (Object? = undefined)

Name	Description
options.logCallback `function?`	The function to consume all LogbookEntry of the evaluation
options.additionalEvaluators `Object?`	Object that contains non-standard Evaluator classes as values; if `configuration.evaluators.[evaluatorName].class` is the same as a key of this object, the corresponding class will be instantiated and used

Returns

Evaluation: The evaluation object

Evaluators

Evaluator

src/evaluator.js

An evaluator to measure some quality score for single turns of a conversation and/or an entire conversation.

Evaluators can be stateful and must not be re-used between conversations. The method Evaluator#evaluate must always be called first to evaluate each turn, in order, starting with turnIndex = 0, and then to evaluate the entire conversation (leaving the turnIndex undefined).

The constructor of an evaluator must have two parameters:

The configuration that has to be passed via super(configuration) and is then available via this.configuration.
A Logbook that can be used to log the initialization process.

new Evaluator(configuration: Object)

Parameters

configuration (Object) The configuration for the evaluator

Instance Members

▸ evaluate(simulation, turnIndex, logbook)

src/evaluator.js

Evaluates one specific turn or the entire conversation.

Evaluators can be stateful. This method must always be called first to evaluate each turn, in order, starting with 0, and then to evaluate the entire conversation (leaving the turnIndex undefined). Evaluators must not be re-used to evaluate multiple conversations.

evaluate(simulation: Simulation, turnIndex: number, logbook: Logbook): (EvaluationResult | null)

Parameters

simulation (Simulation) The simulation to evaluate

turnIndex (number) Index of the user's turn (or rather the response to that turn) to be evaluated, starting with 0, or undefined to evaluate the entire conversation

logbook (Logbook) Uses its log function to log messages

Returns

(EvaluationResult | null): The result of the evaluation, with at least the score property, or null if the Evaluator does not evaluate single turns or the complete conversation and that is what was asked
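The contract described above (a two-parameter constructor, per-turn calls in order, then one call with an undefined turnIndex) can be sketched with a custom evaluator. The Evaluator class below is only a stand-in so the example runs on its own, WordLimitEvaluator and its maxWords setting are hypothetical, and the sketch is synchronous, which may differ from the real method.

```javascript
// Stand-in for GenIRSim's Evaluator base class so the sketch runs on its own;
// in real code, the class exported by the package would be extended instead.
class Evaluator {
  constructor(configuration) {
    this.configuration = configuration;
  }
}

// Hypothetical evaluator: scores each turn by whether the system utterance
// stays within a configured word limit. It does not score conversations.
class WordLimitEvaluator extends Evaluator {
  constructor(configuration, logbook) {
    super(configuration);
    logbook.log("init", "maxWords = " + configuration.maxWords);
  }

  evaluate(simulation, turnIndex, logbook) {
    if (turnIndex === undefined) {
      return null; // no overall evaluation offered by this evaluator
    }
    const utterance = simulation.turns[turnIndex].systemResponse.utterance;
    const words = utterance.trim().split(/\s+/).length;
    logbook.log("evaluate", "turn " + turnIndex + ": " + words + " words");
    return { score: words <= this.configuration.maxWords ? 1 : 0 };
  }
}

// Minimal stand-ins for a Logbook and a Simulation
const logbook = { log: (action, object) => console.log(action, object) };
const simulation = {
  turns: [
    { utterance: "Hi", systemResponse: { utterance: "Hello there" } },
    { utterance: "More?", systemResponse: { utterance: "one two three four five six" } },
  ],
};

// Calling order from the text: every turn in order, then the conversation
const evaluator = new WordLimitEvaluator({ maxWords: 3 }, logbook);
const turnResults = simulation.turns.map((_, index) =>
  evaluator.evaluate(simulation, index, logbook)
);
const overallResult = evaluator.evaluate(simulation, undefined, logbook);
```

Such a class would be registered through options.additionalEvaluators under the key that is used as class in the evaluator's configuration.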

PromptedEvaluator

src/evaluators/prompted-evaluator.js

An evaluator that prompts a language model for a score.

new PromptedEvaluator(configuration: Object, log: Logbook)

Parameters

configuration (Object) The configuration for the evaluator

Name	Description
configuration.llm `LLMConfiguration`	The configuration for the language model to be prompted
configuration.promt `string`	Template for the prompt to evaluate the system response. Variables: `{{x}}`: A property `x` of the configuration for the evaluator; `{{variables.simulation}}`: The entire Simulation; `{{variables.userTurn}}`: The specific user turn, especially with `variables.userTurn.utterance` and `variables.userTurn.systemResponse.utterance`
configuration.requiredKeys `Array?`	The properties that the language model's response must have (in addition to EVALUATION_RESULT.SCORE)

log (Logbook) A function that takes log messages

ReadabilityEvaluator

src/evaluators/readability-evaluator.js

An evaluator that measures the readability of the system response.

new ReadabilityEvaluator(configuration: Object, log: Logbook)

Parameters

configuration (Object) The configuration for the evaluator

Name	Description
configuration.measure `string`	The key of the measure that should be used to calculate the score

log (Logbook) A function that takes log messages

Systems

System

src/system.js

A generative information retrieval system.

Systems can be stateful. However, users are not differentiated: the system can assume it is used by exactly one user. A separate system object must be instantiated for each simulated user.

The constructor of a system must have two parameters:

The configuration that has to be passed via super(configuration) and is then available via this.configuration.
A Logbook that can be used to log the initialization process.

new System(configuration: Object)

Parameters

configuration (Object) The configuration for the system

Instance Members

▸ search(userTurn, logbook)

src/system.js

Generates a response for the user's utterance.

Systems can be stateful. However, users are not differentiated: the system can assume it is used by exactly one user.

search(userTurn: UserTurn, logbook: Logbook): SystemResponse

Parameters

userTurn (UserTurn) The turn object with the user's utterance as utterance

logbook (Logbook) Uses its log function to log messages

Returns

SystemResponse: The system's response with at least the utterance set
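A custom system following this interface might look like the sketch below. The System class is a stand-in so the example is self-contained, CannedSystem and its prefix setting are hypothetical, and search is written synchronously for simplicity, which may differ from the real method.

```javascript
// Stand-in for GenIRSim's System base class so the sketch runs on its own
class System {
  constructor(configuration) {
    this.configuration = configuration;
  }
}

// Hypothetical system that answers with a canned text and counts turns.
// Keeping state is fine because one system object serves exactly one user.
class CannedSystem extends System {
  constructor(configuration, logbook) {
    super(configuration);
    this.turnCount = 0;
    logbook.log("init", "ready");
  }

  search(userTurn, logbook) {
    this.turnCount += 1;
    logbook.log("search", userTurn.utterance);
    // A SystemResponse needs at least the utterance property
    return {
      utterance:
        this.configuration.prefix + userTurn.utterance + " (turn " + this.turnCount + ")",
    };
  }
}

const logbook = { log: (action, object) => console.log(action, object) };
const system = new CannedSystem({ prefix: "You said: " }, logbook);
const firstResponse = system.search({ utterance: "hello" }, logbook);
const secondResponse = system.search({ utterance: "bye" }, logbook);
console.log(firstResponse.utterance);
console.log(secondResponse.utterance);
```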

BasicChatSystem

src/systems/basic-chat-system.js

A blackbox retrieval system that implements a basic chat API.

The API needs to consume a JSON object that has at least the property messages, which is an array of message objects. Each message object has the string property role, which is either assistant or user, and the string property content that contains the message text.

The API produces a JSON object that has at least the property content, which is the message text of the response.
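The request and response shapes described above can be sketched as follows. The helper buildChatRequest is hypothetical and is not how BasicChatSystem is implemented; it only shows a request body that satisfies the stated contract, assuming earlier turns are sent along as alternating user and assistant messages.

```javascript
// Hypothetical helper: builds the JSON body for a basic chat API from the
// turns so far and the new user utterance.
function buildChatRequest(baseRequest, previousTurns, utterance) {
  const messages = [];
  for (const turn of previousTurns) {
    messages.push({ role: "user", content: turn.utterance });
    messages.push({ role: "assistant", content: turn.systemResponse.utterance });
  }
  messages.push({ role: "user", content: utterance });
  // Copy the base request so that it is not modified between queries
  return { ...baseRequest, messages };
}

const request = buildChatRequest(
  { model: "example-model" },
  [{ utterance: "First question", systemResponse: { utterance: "First answer" } }],
  "Second question"
);
console.log(JSON.stringify(request, null, 2));

// A conforming reply from the API needs at least the content property
const exampleReply = { content: "Second answer" };
console.log(exampleReply.content);
```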

new BasicChatSystem(configuration: Object, log: Logbook)

Parameters

configuration (Object) The configuration for the system

Name	Description
configuration.url `string`	The URL of the chat endpoint
configuration.request `string`	The object that is sent to the endpoint on each query with the messages added

log (Logbook) A function that takes log messages

Instance Members

▸ search(userTurn, logbook)

src/systems/basic-chat-system.js

Retrieves results for the user's query.

search(userTurn: UserTurn, logbook: Logbook): SystemResponse

Parameters

userTurn (UserTurn) The turn object with the user's utterance as utterance

logbook (Logbook) Uses its log function to log messages

Returns

SystemResponse: The system's response with the utterance set and the complete response of the system as response

GenerativeElasticSystem

src/systems/generative-elastic-system.js

A basic generative information retrieval system implemented using an LLM and an Elasticsearch server.

Properties of the SystemResponse objects that GenerativeElasticSystem#search produces are determined by the configuration.generation.message extended with SYSTEM_RESPONSE.RESULTS and (the same as one string) SYSTEM_RESPONSE.RESULTS_PAGE.

new GenerativeElasticSystem(configuration: Object, log: Logbook)

Parameters

configuration (Object) The configuration for the system

Name	Description
configuration.llm `LLMConfiguration`	The configuration for the language model employed during retrieval
configuration.preprocessing `Object?`	No preprocessing if this property is `undefined`
configuration.preprocessing.message `string?`	Template for the prompt to preprocess the user's utterance (no preprocessing will happen if `configuration.preprocessing` is `undefined`). The LLM's response must be formatted as JSON. Variables: `{{x}}`: A property `x` of the configuration for the system; `{{variables.messages}}`: The previous exchange between user and system (assistant) rendered as string (templates#joinMessages); `{{variables.userTurn}}`: The last UserTurn, especially with `variables.userTurn.utterance`
configuration.preprocessing.requiredKeys `Array?`	The properties that the preprocessing response must have (none by default)
configuration.search `Object`
configuration.search.url `string`	The complete URL of the Elasticsearch server's API endpoint (up to but excluding `_search`)
configuration.search.query `string`	The Elasticsearch query object for retrieving results, but every string in it is treated as a template. Variables are the same as for `configuration.preprocessing.message`, plus: `{{variables.preprocessing}}`: The parsed output of the preprocessing (if preprocessing was performed)
configuration.search._source `Object?`	An object that specifies which source attributes to include in the response, see the Elasticsearch documentation
configuration.search.size `number`	The number of results to retrieve
configuration.generation `Object`
configuration.generation.message `string`	Template for the prompt to generate a system response for the user's utterance from the retrieved search results. The LLM's response must be formatted as JSON. Variables are the same as for `configuration.search.query`, plus: `{{variables.results}}`: The retrieved results rendered as a string
configuration.generation.searchResultKeys `Array?`	The properties of each result that are used to render the result in the generation message
configuration.generation.requiredKeys `Array?`	The properties that the generated response must have (in addition to SYSTEM_RESPONSE.UTTERANCE)

log (Logbook) A function that takes log messages
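A configuration for this system could be assembled along the lines of the following sketch. The index name, the field names text and title, the model name, and the prompt wording are placeholders. The sketch also assumes that configuration.search.query holds the query part of an Elasticsearch search request and that the utterance key in the LLM's JSON reply is named utterance.

```javascript
// Sketch of a GenerativeElasticSystem configuration without preprocessing.
// Index name, field names, model name, and prompt text are placeholders.
const generativeElasticConfiguration = {
  class: "GenerativeElasticSystem",
  llm: { url: "http://localhost:11434/api/chat", model: "example-model" },
  search: {
    // URL up to, but excluding, "_search"
    url: "http://localhost:9200/example-index/",
    // Every string inside the query object is treated as a template
    query: { match: { text: "{{variables.userTurn.utterance}}" } },
    size: 3,
  },
  generation: {
    message:
      "Answer the question using these results:\n{{variables.results}}\n" +
      "Question: {{variables.userTurn.utterance}}\n" +
      'Reply as JSON with the key "utterance".',
    // Only these properties of each result are rendered into the prompt
    searchResultKeys: ["title", "text"],
  },
};

console.log(JSON.stringify(generativeElasticConfiguration, null, 2));
```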

Instance Members

▸ search(userTurn, logbook)

src/systems/generative-elastic-system.js

Retrieves results for the user's query.

search(userTurn: UserTurn, logbook: Logbook): SystemResponse

Parameters

userTurn (UserTurn) The turn object with the user's utterance as utterance

logbook (Logbook) Uses its log function to log messages

Returns

SystemResponse: The system's response with at least the utterance set

Users

User

src/user.js

Abstract class for simulators of a user of a generative information retrieval system.

Users can be stateful. Calling User#start is equivalent to starting a new conversation. Simple users might reset at the start of that method, whereas others might have a cross-conversation state. In any case, that method must be called at least once before calling User#followUp.

The constructor of a user must have two parameters:

The configuration that has to be passed via super(configuration) and is then available via this.configuration.
A Logbook that can be used to log the initialization process.

new User(configuration: Object)

Parameters

configuration (Object) The configuration for the user

Instance Members

▸ start(topic, logbook)

src/user.js

Starts a new simulation for the specified topic.

Users can be stateful. Calling this method is equivalent to starting a new conversation. Simple users might reset at the start of this method, whereas others might have a cross-conversation state. In any case, this method must be called at least once before calling User#followUp.

start(topic: any, logbook: Logbook): UserTurn

Parameters

topic (any)

logbook (Logbook) Uses its log function to log messages

Properties

topic (Topic) : The topic

Returns

UserTurn: The turn with at least the utterance set

▸ followUp(systemResponse, logbook)

src/user.js

Follows up on a system response to a previous utterance.

Users can be stateful. The method User#start must be called at least once before calling this method.

followUp(systemResponse: any, logbook: Logbook): UserTurn

Parameters

systemResponse (any)

logbook (Logbook) Uses its log function to log messages

Properties

response (SystemResponse) : The latest response of the system

Returns

UserTurn: The turn with at least the utterance set
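The start and followUp contract can be sketched with a simple scripted user. The User class below is a stand-in so the example is self-contained, ScriptedUser and its followUps setting are hypothetical, and both methods are synchronous here, which may differ from the real interface.

```javascript
// Stand-in for GenIRSim's User base class so the sketch runs on its own
class User {
  constructor(configuration) {
    this.configuration = configuration;
  }
}

// Hypothetical user that asks a fixed list of follow-up questions in order
class ScriptedUser extends User {
  constructor(configuration, logbook) {
    super(configuration);
    this.nextIndex = 0;
    logbook.log("init", configuration.followUps.length + " follow-ups");
  }

  start(topic, logbook) {
    this.nextIndex = 0; // a simple user resets when a conversation starts
    logbook.log("start", topic.description);
    return { utterance: "I would like to know: " + topic.description };
  }

  followUp(systemResponse, logbook) {
    const questions = this.configuration.followUps;
    // Cycle through the scripted questions if the conversation is long
    const utterance = questions[this.nextIndex % questions.length];
    this.nextIndex += 1;
    logbook.log("followUp", "after: " + systemResponse.utterance);
    return { utterance };
  }
}

const logbook = { log: (action, object) => console.log(action, object) };
const user = new ScriptedUser({ followUps: ["Why?", "Any sources?"] }, logbook);
const firstTurn = user.start({ description: "how heat pumps work" }, logbook);
const secondTurn = user.followUp({ utterance: "They move heat." }, logbook);
console.log(firstTurn.utterance);
console.log(secondTurn.utterance);
```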

StaticUser

src/users/static-user.js

A basic user model that does not change during conversation and only looks at the latest response for following up on it.

new StaticUser(configuration: Object, log: Logbook)

Parameters

configuration (Object) The configuration for the user

Name	Description
configuration.llm `LLMConfiguration`	The configuration for the language model employed during simulation
configuration.start `string`	Template for the prompt to simulate the first message for a topic. Variables: `{{x}}`: A property `x` of the configuration for the user; `{{variables.topic}}`: The Topic object
configuration.followUp `string`	Template for the prompt to simulate a follow-up message to a system response. Variables: `{{x}}`: A property `x` of the configuration for the user; `{{variables.topic}}`: The Topic object; `{{variables.systemResponse}}`: The SystemResponse object of the response to follow up on

log (Logbook) A function that takes log messages

Touche25RADUser

src/users/touche25-rad-user.js

User model for the Touche 25 Retrieval-Augmented Debating task. A client for the corresponding server.

new Touche25RADUser(configuration: Object, log: Logbook)

Parameters

configuration (Object) The configuration for the user

Name	Description
configuration.url `string`	The URL of the chat API
configuration.model `string`	The name of the user model

log (Logbook) A function that takes log messages

Utility

templates

src/templates.js

Static methods for filling in text templates.

templates

Static Members

▸ render(text, context, options = undefined)

src/templates.js

Replaces occurrences of {{path.to.variable}} in the text with the corresponding values in the context object (e.g., replace with context["path"]["to"]["variable"]).

If the input is not a string but an object or array, it is recursively cloned and occurrences in the contents are replaced. Numbers, booleans, etc. are shallow copied.

render(text: any, context: Object, options: Object?)

Parameters

text (any) The template string or an object or array structure that contains template strings (among others)

context (Object) The values of the variables that can be referenced

options (Object? = undefined) Replacement options

Name	Description
options.ignoreMissing `boolean?`	Whether to ignore if a reference variable does not exist in the context (not changing the text) instead of throwing an error
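The described semantics can be illustrated with a small re-implementation. The function renderSketch below is not the library's code and may differ from templates.render in details; in particular, rendering object values with JSON.stringify is an assumption of this sketch.

```javascript
// Illustrative re-implementation of the described behaviour; the actual
// templates.render of GenIRSim may differ in details.
function renderSketch(text, context, options = {}) {
  if (typeof text === "string") {
    return text.replace(/\{\{([^{}]+)\}\}/g, (match, path) => {
      // Walk the context along the dot-separated path
      let value = context;
      for (const key of path.trim().split(".")) {
        value = value === undefined || value === null ? undefined : value[key];
      }
      if (value === undefined) {
        if (options.ignoreMissing) {
          return match; // leave the reference in the text unchanged
        }
        throw new Error("Missing template variable: " + path);
      }
      // Assumption: non-string values are rendered as JSON
      return typeof value === "object" ? JSON.stringify(value) : String(value);
    });
  }
  if (Array.isArray(text)) {
    return text.map((item) => renderSketch(item, context, options));
  }
  if (text !== null && typeof text === "object") {
    return Object.fromEntries(
      Object.entries(text).map(([key, value]) => [
        key,
        renderSketch(value, context, options),
      ])
    );
  }
  return text; // numbers, booleans, etc. are returned as they are
}

const context = { variables: { topic: { description: "heat pumps" } } };
console.log(renderSketch("Ask about {{variables.topic.description}}.", context));
console.log(renderSketch({ query: ["{{variables.topic.description}}", 3] }, context));
console.log(renderSketch("{{unknown}} stays", context, { ignoreMissing: true }));
```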

▸ joinMessages(messages)

src/templates.js

Converts the messages exchanged with an LLM into a single string.

joinMessages(messages: Array): string

Parameters

messages (Array) Messages for a chat API

Returns

string: The converted messages

▸ joinProperties(object, keys = undefined)

src/templates.js

Converts the properties and values of an object into a single string.

joinProperties(object: Object, keys: Array?): string

Parameters

object (Object) The object to be converted

keys (Array? = undefined) The names of the properties to convert, if not all (the default)

Returns

string: The converted object

▸ tsv2Contexts(tsv)

src/templates.js

Converts each row of a tab-separated values text (except the header) to a context object.

tsv2Contexts(tsv: string): Array

Parameters

tsv (string) Contents of a tab-separated values file (no quotations), first line is treated as header that specifies the keys and the values in the other lines are then the respective values, each line then being converted to a context object

Returns

Array: Array of the created context objects
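The conversion can be pictured with the following stand-alone sketch. The function tsv2ContextsSketch is an illustrative re-implementation, not the library's code; it assumes that all values stay strings and that empty lines are skipped.

```javascript
// Illustrative re-implementation: header line gives the keys, every other
// line becomes one context object. Values are kept as strings.
function tsv2ContextsSketch(tsv) {
  const lines = tsv.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length === 0) {
    return [];
  }
  const keys = lines[0].split("\t");
  return lines.slice(1).map((line) => {
    const values = line.split("\t");
    return Object.fromEntries(keys.map((key, index) => [key, values[index]]));
  });
}

const contexts = tsv2ContextsSketch(
  "topic\tmaxTurns\nsolar panels\t2\nheat pumps\t4\n"
);
console.log(contexts);
```

Each resulting object can serve as one set of replacements, which matches the case where a string is passed as replacements to run.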

Types

Evaluation

src/index.js

Object that represents the evaluation of a simulation.

Evaluation

Type: Object

Properties

configuration (Object) : The configuration of the evaluation

simulation (Simulation) : The simulation that was evaluated

userTurnsEvaluations (Array) : For each user turn of the simulation, in order, an object where the keys are the names of the configured evaluators (if they evaluated the specific turn of the simulation) and the values are the respective EvaluationResults (each with an additional property, milliseconds, that gives the time taken for the evaluation in milliseconds)

overallEvaluations (Object) : An object where the keys are the names of the configured evaluators (if they evaluated the overall simulation) and the values are the respective EvaluationResults (each with an additional property, milliseconds, that gives the time taken for the evaluation in milliseconds)

millisecondsEvaluation (number) : Time taken for the evaluation in milliseconds
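As an example of working with this structure, the sketch below averages the per-turn scores of each evaluator. The function meanTurnScores and the evaluator names are hypothetical, and the sample object only mimics the documented properties.

```javascript
// Hypothetical helper: mean per-turn score for each evaluator name
function meanTurnScores(evaluation) {
  const sums = {};
  const counts = {};
  for (const turnEvaluations of evaluation.userTurnsEvaluations) {
    for (const [name, result] of Object.entries(turnEvaluations)) {
      // Skip anything that is not a result object with a numeric score
      if (result === null || typeof result.score !== "number") {
        continue;
      }
      sums[name] = (sums[name] || 0) + result.score;
      counts[name] = (counts[name] || 0) + 1;
    }
  }
  return Object.fromEntries(
    Object.keys(sums).map((name) => [name, sums[name] / counts[name]])
  );
}

// Sample object that mimics the documented Evaluation properties
const evaluation = {
  userTurnsEvaluations: [
    {
      readability: { score: 0.5, milliseconds: 3 },
      relevance: { score: 1, milliseconds: 800 },
    },
    { readability: { score: 1, milliseconds: 2 } },
  ],
  overallEvaluations: {},
  millisecondsEvaluation: 805,
};

const means = meanTurnScores(evaluation);
console.log(means);
```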

EvaluationResult

src/evaluator.js

Object returned by Evaluator#evaluate with at least a score.

EvaluationResult

Type: Object

Properties

score (number) : A number between 0 and 1, with higher values indicating better responses

milliseconds (number?) : Time taken for evaluation in milliseconds (this property is automatically added by GenIRSim)

EVALUATION_RESULT

src/evaluator.js

Constants for EvaluationResult property names.

EVALUATION_RESULT

Static Members

▸ SCORE

src/evaluator.js

A number between 0 and 1, with higher values indicating better responses.

SCORE

▸ EXPLANATION

src/evaluator.js

A string explanation for the score.

EXPLANATION

LLM

src/llm.js

A large language model.

new LLM(configuration: LLMConfiguration, logbook: Logbook)

Parameters

configuration (LLMConfiguration) The configuration object

logbook (Logbook) The logbook to log to

Instance Members

▸ createAssistantMessage(message)

src/llm.js

Creates a message object for a message from the assistant.

createAssistantMessage(message: string): Object

Parameters

message (string) The string message that the assistant says

Returns

Object: The message object

▸ createSystemMessage(message)

src/llm.js

Creates a system prompt object.

createSystemMessage(message: string): Object

Parameters

message (string) The string prompt

Returns

Object: The message object

▸ createUserMessage(message)

src/llm.js

Creates a message object for a message from the user.

createUserMessage(message: string): Object

Parameters

message (string) The string message that the user says

Returns

Object: The message object

▸ chat(messages, action)

src/llm.js

Generates a chat completion.

chat(messages: Array, action: string): string

Parameters

messages (Array) The message history for the completion, use LLM#createSystemMessage, LLM#createUserMessage, and LLM#createAssistantMessage to create these

action (string) Name of the action for which the text is generated, used for logging

Returns

string: The completion

▸ json(messages, action, requiredKeys = [], maxRetries = 3)

src/llm.js

Generates a chat completion in JSON format.

json(messages: Array, action: string, requiredKeys: Array?, maxRetries: number?): Object

Parameters

messages (Array) The message history for the completion, use LLM#createSystemMessage, LLM#createUserMessage, and LLM#createAssistantMessage to create these

action (string) Name of the action for which the text is generated, used for logging

requiredKeys (Array? = []) Names of properties that the parsed JSON completion must have

maxRetries (number? = 3) Maximum number of times to retry the completion (if it cannot be parsed or is missing a required key) before throwing an error

Returns

Object: The completion as parsed object
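The retry behaviour can be sketched independently of any LLM. In the sketch below, jsonWithRetries is a hypothetical stand-alone function, complete stands in for the call to the language model and is synchronous for simplicity, and counting one initial attempt plus maxRetries retries is an assumption.

```javascript
// Hypothetical sketch of the retry logic; `complete` stands in for the call
// to the LLM and returns the raw completion text.
function jsonWithRetries(complete, requiredKeys = [], maxRetries = 3) {
  let lastError = new Error("No completion attempted");
  // Assumption: one initial attempt plus up to maxRetries retries
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    try {
      const parsed = JSON.parse(complete());
      const missing = requiredKeys.filter((key) => !(key in parsed));
      if (missing.length === 0) {
        return parsed;
      }
      lastError = new Error("Missing keys: " + missing.join(", "));
    } catch (error) {
      lastError = error; // not valid JSON (or not an object): try again
    }
  }
  throw lastError;
}

// Stub completions: invalid JSON, then a missing key, then a valid reply
const replies = ["not json", '{"explanation": "fine"}', '{"score": 0.8}'];
let calls = 0;
const result = jsonWithRetries(() => replies[calls++], ["score"]);
console.log(result.score, calls);
```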

LLMConfiguration

src/llm.js

Configuration for an LLM.

Properties are url (see below) and all parameters for the chat completion endpoint, which includes the required model, but also optional parameters like options.temperature (see the modelfile parameter of Ollama).

LLMConfiguration

Type: Object

Properties

url (string) : The complete URL of the LLM's chat API endpoint

model (string) : The large language model name as per the API
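A configuration object of this type might look like the following sketch. The URL assumes a local Ollama server with its chat endpoint, and the model name is a placeholder; both are assumptions rather than defaults of GenIRSim.

```javascript
// Sketch of an LLMConfiguration. The URL assumes a local Ollama server;
// the model name is a placeholder.
const llmConfiguration = {
  url: "http://localhost:11434/api/chat",
  model: "example-model",
  // Further properties are passed on as parameters of the chat endpoint
  options: { temperature: 0 },
};

console.log(JSON.stringify(llmConfiguration, null, 2));
```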

Logbook

src/logbook.js

A logbook to log actions specific to one source.

new Logbook(source: string, callback: function?, prefix: string?)

Parameters

source (string) The source for which to log entries

callback (function?) An optional function to call with each LogbookEntry created on Logbook#log

prefix (string?) An optional prefix to the action logged

Instance Members

▸ log(action, object?)

src/logbook.js

Logs one entry to the logbook.

log(action: string, object: (Object | string)?): LogbookEntry

Parameters

action (string) The action for which to log

object ((Object | string)?) An optional object or string describing the event that is logged

Returns

LogbookEntry: The logged entry

LogbookEntry

src/logbook.js

One entry for the logbook, issued by the source to log for the action.

new LogbookEntry(source: string, action: string, data: (Object | string)?)

Parameters

source (string) The source that produced this entry

action (string) The action for which this entry was produced

data ((Object | string)?) An optional object or string describing the event that is logged

Instance Members

▸ time

src/logbook.js

The entry's creation date in milliseconds since epoch.

time

Type: number

▸ source

src/logbook.js

The source that produced this entry.

source

Type: string

▸ action

src/logbook.js

The action for which this entry was produced.

action

Type: string

▸ data

src/logbook.js

data

Parameters

data ((Object | string | undefined)) An optional object or string that describes the event that is logged

▸ isContinuationOf(previousEntry)

src/logbook.js

Checks whether this entry is a continuation of the previous entry (both belong to the same action).

isContinuationOf(previousEntry: LogbookEntry): boolean

Parameters

previousEntry (LogbookEntry) The previous entry for the logbook

Returns

boolean: Whether it is

▸ hasContent()

src/logbook.js

Checks whether this entry has some content.

hasContent(): boolean

Returns

boolean: Whether it has

▸ hasTextContent()

src/logbook.js

Checks whether this entry has some text content.

hasTextContent(): boolean

Returns

boolean: Whether it has

▸ getContent()

src/logbook.js

Gets the content as formatted string.

getContent(): string

Returns

string: The entry's content as string
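A function passed as options.logCallback receives entries of this kind. The sketch below uses plain objects with the documented properties time, source, action, and data instead of real LogbookEntry instances; the formatting is an arbitrary choice for illustration.

```javascript
// Hypothetical logCallback that collects entries and prints one line each
const entries = [];

function formatEntry(entry) {
  const time = new Date(entry.time).toISOString();
  let data = "";
  if (entry.data !== undefined) {
    data = typeof entry.data === "string" ? entry.data : JSON.stringify(entry.data);
  }
  return (time + " [" + entry.source + "] " + entry.action + " " + data).trim();
}

function logCallback(entry) {
  entries.push(entry);
  console.log(formatEntry(entry));
}

// Plain stand-in objects with the documented LogbookEntry properties
logCallback({ time: 0, source: "user", action: "start", data: "first question" });
logCallback({ time: 1000, source: "system", action: "search", data: { size: 3 } });
logCallback({ time: 2000, source: "controller", action: "done", data: undefined });
```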

Simulation

src/index.js

Object that represents a completed simulation.

Simulation

Type: Object

Properties

configuration (Object) : The configuration of the simulation

turns (Array) : List of simulated UserTurns (each one includes the system's response)

milliseconds (number) : Time taken for the simulation in milliseconds

SystemResponse

src/system.js

Object that represents a system's response to a user's utterance in the simulated conversation with at least the system's utterance.

SystemResponse

Type: Object

Properties

utterance (string) : The utterance of the system

SYSTEM_RESPONSE

src/system.js

Constants for SystemResponse property names.

SYSTEM_RESPONSE

Static Members

▸ UTTERANCE

src/system.js

The utterance displayed by the system back to the user.

UTTERANCE

▸ RESULTS

src/system.js

The results that the system retrieved to answer the user's utterance.

RESULTS

▸ RESULTS_PAGE

src/system.js

The results as one string that represents the results page of the system.

RESULTS_PAGE

Topic

src/index.js

Object that represents a topic (or task, information need).

Topic

Type: Object

Properties

description (string) : A natural language description of the information task to be accomplished

UserTurn

src/user.js

Object that represents a user's turn in the simulated conversation with at least the user's utterance.

UserTurn

Type: Object

Properties

utterance (string) : The simulated utterance sent from the user to the system

systemResponse (SystemResponse) : The response sent from the system to the user as a reply

milliseconds (number?) : Time taken for simulation in milliseconds (this property is automatically added by GenIRSim)

USER_TURN

src/user.js

Constants for UserTurn property names.

USER_TURN

Static Members

▸ UTTERANCE

src/user.js

The simulated utterance sent from the user to the system.

UTTERANCE

▸ SYSTEM_RESPONSE

src/user.js

The SystemResponse sent from the system to the user as a reply.

SYSTEM_RESPONSE