Model Zoo
The table below lists models from the HuggingFace Model Hub that are supported in Lightning IR. For each model, it reports re-ranking effectiveness in terms of nDCG@10 on the officially released run files containing 1,000 passages per query for TREC Deep Learning 2019 and 2020.
Native models were fine-tuned with Lightning IR, and their HuggingFace model cards provide Lightning IR configurations for reproducing the fine-tuning. Non-native models were fine-tuned externally but are supported in Lightning IR for inference.
Reproduction
The following configuration and command can be used to reproduce the results in the table below:
config.yaml

```yaml
trainer:
  logger: false
  enable_checkpointing: false
model:
  class_path: CrossEncoderModule # for cross-encoders
  # class_path: BiEncoderModule # for bi-encoders
  init_args:
    model_name_or_path: {MODEL_NAME}
    evaluation_metrics:
      - nDCG@10
data:
  class_path: LightningIRDataModule
  init_args:
    inference_datasets:
      - class_path: RunDataset
        init_args:
          run_path_or_id: msmarco-passage/trec-dl-2019/judged
      - class_path: RunDataset
        init_args:
          run_path_or_id: msmarco-passage/trec-dl-2020/judged
```
```bash
lightning-ir re_rank --config config.yaml
```
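The same run can also be sketched against the Python API instead of the CLI. This is a minimal, hedged example: it assumes the classes named in the configuration above (`CrossEncoderModule`, `LightningIRDataModule`, `RunDataset`) are importable from `lightning_ir` together with a `LightningIRTrainer` that exposes a `re_rank` method; the model name `webis/monoelectra-base` and the `inference_batch_size` value are example values to substitute.

```python
# Minimal sketch of re-ranking via the Python API instead of the CLI.
# Assumes the classes from the config above are importable from `lightning_ir`
# and that the trainer exposes a `re_rank` method; the model name and batch
# size are example values only.
from lightning_ir import (
    CrossEncoderModule,
    LightningIRDataModule,
    LightningIRTrainer,
    RunDataset,
)

# Cross-encoder to evaluate; swap in any model name from the table below.
module = CrossEncoderModule(
    model_name_or_path="webis/monoelectra-base",  # example model name
    evaluation_metrics=["nDCG@10"],
)

# Official TREC DL 2019/2020 run files as inference datasets.
data_module = LightningIRDataModule(
    inference_datasets=[
        RunDataset(run_path_or_id="msmarco-passage/trec-dl-2019/judged"),
        RunDataset(run_path_or_id="msmarco-passage/trec-dl-2020/judged"),
    ],
    inference_batch_size=4,  # assumed parameter; adjust to available memory
)

trainer = LightningIRTrainer(logger=False, enable_checkpointing=False)
trainer.re_rank(module, data_module)
```

The YAML configuration above maps directly onto these constructor arguments, so switching models only requires changing `model_name_or_path`.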
| Model Name | Native | TREC DL 2019 | TREC DL 2020 |
|---|---|---|---|
| Cross-Encoders | | | |
| | ✅ | 0.751 | 0.769 |
| | ✅ | 0.750 | 0.791 |
| | ❌ | 0.723 | 0.714 |
| | ❌ | 0.720 | 0.728 |
| | ❌ | 0.726 | 0.752 |
| | ❌ | 0.734 | 0.745 |
| | ❌ | 0.737 | 0.759 |
| | ❌ | 0.721 | 0.776 |
| Bi-Encoders | | | |
| | ✅ | 0.711 | 0.714 |
| | ❌ | 0.705 | 0.735 |
| | ✅ | 0.751 | 0.749 |
| | ❌ | 0.732 | 0.746 |
| | ✅ | 0.736 | 0.723 |
| | ❌ | 0.715 | 0.749 |
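The reported nDCG@10 values can also be recomputed outside of Lightning IR from a TREC-style run file of the re-ranked results. The sketch below is an illustration only: it uses the `ir_datasets` and `ir_measures` libraries, whose dataset identifiers (e.g. `msmarco-passage/trec-dl-2019/judged`) and measure names (e.g. `nDCG@10`) the configuration above follows; the run file path is a placeholder.

```python
# Recompute nDCG@10 for a re-ranked run independently of Lightning IR.
import ir_datasets
import ir_measures
from ir_measures import nDCG

# Judged TREC DL 2019 queries and relevance judgments.
dataset = ir_datasets.load("msmarco-passage/trec-dl-2019/judged")

# TREC-style run file holding the re-ranked results (placeholder path).
run = ir_measures.read_trec_run("runs/trec-dl-2019.run")

# Aggregate nDCG@10 over all judged queries.
print(ir_measures.calc_aggregate([nDCG@10], dataset.qrels_iter(), run))
```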