DocSample
- class lightning_ir.data.data.DocSample(doc_id: str, doc: str)
Bases: object
A sample of document data containing a document and its id.
- Parameters:
doc_id (str) – Id of the document
doc (str) – Document text
Methods
__init__(doc_id, doc)
from_ir_dataset_sample(sample[, text_fields]) – Create a DocSample from an ir_datasets sample.
Attributes
doc_id
doc
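The sketch below illustrates how a sample type with these two attributes and a `from_ir_dataset_sample` constructor might behave. It is self-contained and does not import lightning_ir: `DocSampleSketch` and `FakeDoc` are hypothetical stand-ins, and joining multiple `text_fields` with a single space is an assumption rather than confirmed library behavior.

```python
from dataclasses import dataclass
from typing import NamedTuple, Optional, Sequence


class FakeDoc(NamedTuple):
    """Hypothetical stand-in for an ir_datasets GenericDoc-like record."""

    doc_id: str
    title: str
    text: str

    def default_text(self) -> str:
        # Assumption: the default text is simply the body text.
        return self.text


@dataclass
class DocSampleSketch:
    """Hypothetical stand-in mirroring the documented doc_id and doc fields."""

    doc_id: str
    doc: str

    @classmethod
    def from_ir_dataset_sample(
        cls, sample: FakeDoc, text_fields: Optional[Sequence[str]] = None
    ) -> "DocSampleSketch":
        if text_fields is None:
            # No fields requested: fall back to the sample's default text.
            doc = sample.default_text()
        else:
            # Assumption: the selected fields are joined with a single space.
            doc = " ".join(getattr(sample, field) for field in text_fields)
        return cls(doc_id=sample.doc_id, doc=doc)


raw = FakeDoc(doc_id="d1", title="Neural ranking", text="Rerank with transformers.")

# With text_fields=None the default text is used.
default_sample = DocSampleSketch.from_ir_dataset_sample(raw)

# With explicit text_fields only the named fields contribute to the text.
custom_sample = DocSampleSketch.from_ir_dataset_sample(raw, text_fields=["title", "text"])
```

Passing `text_fields` is useful when a dataset stores the document in several fields (for example a title and a body) and both should be part of the text; leaving it as None defers to whatever the sample itself reports as its default text.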
- classmethod from_ir_dataset_sample(sample: GenericDoc, text_fields: Sequence[str] | None = None) → DocSample
Create a DocSample from an ir_datasets sample.
- Parameters:
sample (GenericDoc) – ir_datasets sample
text_fields (Sequence[str] | None, optional) – Fields to parse the text from. If None, uses the sample's default_text(). Defaults to None.
- Returns:
DocSample created from the ir_datasets sample
- Return type: