QueryDataset

class lightning_ir.data.dataset.QueryDataset(query_dataset: str, num_queries: int | None = None)

Bases: _IRDataset, _DataParallelIterableDataset

__init__(query_dataset: str, num_queries: int | None = None) → None

Dataset containing queries.

Parameters:
  • query_dataset (str) – Path to file containing queries or valid ir_datasets id

  • num_queries (int | None, optional) – Number of queries in the dataset. If None, an attempt is made to infer the number of queries. Defaults to None
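A minimal construction sketch based only on the signature documented above. The helper name make_query_dataset is hypothetical, the ir_datasets id is just an example value, and the import is guarded so the snippet still runs where lightning_ir is not installed.

```python
from typing import Any, Optional

try:
    # Import path taken from the class signature documented above.
    from lightning_ir.data.dataset import QueryDataset
except ImportError:  # lightning_ir is not installed in this environment
    QueryDataset = None


def make_query_dataset(query_dataset: str, num_queries: Optional[int] = None) -> Any:
    """Hypothetical helper: build a QueryDataset, or return None if the library is missing."""
    if QueryDataset is None:
        return None
    # num_queries=None leaves it to the dataset to try to infer the count.
    return QueryDataset(query_dataset, num_queries=num_queries)


# Example ir_datasets id (an assumption for illustration); a path to a query file
# is the other documented option for the first argument.
dataset = make_query_dataset("msmarco-passage/trec-dl-2019/judged")
```

Passing an explicit num_queries is presumably useful when the count cannot be inferred, for instance for a local query file.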

Methods

__init__(query_dataset[, num_queries])

Dataset containing queries.

Attributes

DASHED_DATASET_MAP

Map of dataset names with dashes to dataset names with slashes.

dataset

Dataset name.

dataset_id

Dataset id.

docs

Documents in the dataset.

docs_dataset_id

ID of the dataset containing the documents.

ir_dataset

Instance of ir_datasets.Dataset.

qrels

Qrels in the dataset.

queries

Queries in the dataset.

property DASHED_DATASET_MAP: Dict[str, str]

Map of dataset names with dashes to dataset names with slashes.

Returns:

Dataset map

Return type:

Dict[str, str]
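To illustrate what such a map looks like, the sketch below builds one from a list of slash-separated ids. The function build_dashed_map and the example ids are assumptions for illustration; how the library actually populates DASHED_DATASET_MAP is not documented here.

```python
from typing import Dict, Iterable


def build_dashed_map(slash_ids: Iterable[str]) -> Dict[str, str]:
    """Map each id with '/' replaced by '-' back to the original slash-separated id."""
    return {dataset_id.replace("/", "-"): dataset_id for dataset_id in slash_ids}


# Example ids in ir_datasets style, chosen only for illustration.
example_ids = ["msmarco-passage/train", "msmarco-passage/trec-dl-2019/judged"]
dashed_map = build_dashed_map(example_ids)

# A lookup table is needed because the ids themselves contain dashes, so
# "msmarco-passage-train" cannot be converted back by string replacement alone.
print(dashed_map["msmarco-passage-train"])
```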

property dataset: str

Dataset name.

Returns:

Dataset name

Return type:

str

property dataset_id: str

Dataset id.

Returns:

Dataset id

Return type:

str

property docs: Docstore | Dict[str, GenericDoc]

Documents in the dataset.

Raises:

ValueError – If no documents are found in the dataset

Returns:

Documents

Return type:

ir_datasets.indices.Docstore | Dict[str, GenericDoc]

property docs_dataset_id: str

ID of the dataset containing the documents.

Returns:

Document dataset id

Return type:

str

property ir_dataset: Dataset | None

Instance of ir_datasets.Dataset.

Returns:

ir_datasets dataset

Return type:

ir_datasets.Dataset | None

property qrels: DataFrame | None

Qrels in the dataset.

Returns:

Qrels

Return type:

pd.DataFrame | None

property queries: Series

Queries in the dataset.

Raises:

ValueError – If no queries are found in the dataset

Returns:

Queries

Return type:

pd.Series