pybool_ir.experiments.retrieval
Classes and methods for running retrieval experiments.
Functions
| AdHocExperiment | Runs ad-hoc queries without requiring a Collection object, unlike the RetrievalExperiment class. |
Classes
| LuceneSearcher | Basic wrapper around a Lucene index that provides a simple interface for searching. |
| RetrievalExperiment | This class provides a convenient interface for running retrieval experiments. |
- pybool_ir.experiments.retrieval.AdHocExperiment(indexer: pybool_ir.index.index.Indexer, raw_query: str | None = None, query_parser: pybool_ir.query.parser.QueryParser = <pybool_ir.query.pubmed.parser.PubmedQueryParser object>, date_from='1900/01/01', date_to='3000/01/01', ignore_dates: bool = False, date_field: str = 'dp') -> RetrievalExperiment
Unlike the RetrievalExperiment class, which expects a Collection object, this function allows ad-hoc queries to be run, for example:
>>> from pybool_ir.experiments.retrieval import AdHocExperiment
>>> from pybool_ir.index.pubmed import PubmedIndexer
>>>
>>> with AdHocExperiment(PubmedIndexer(index_path="pubmed"), raw_query="headache[tiab]") as experiment:
...     print(experiment.count())
- class pybool_ir.experiments.retrieval.LuceneSearcher(indexer: Indexer)
Bases: ABC
Basic wrapper around a Lucene index that provides a simple interface for searching. This class can be used as a context manager, which automatically opens and closes the index. It is possible to use this class directly to run experiments, but the other classes in this module provide a more convenient interface.
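The context-manager behaviour described above can be illustrated with a toy stand-in. This is a minimal sketch, not pybool_ir's implementation: `ToySearcher` and its in-memory `docs` dictionary are hypothetical, and no Lucene index is involved. It only shows how `__enter__` and `__exit__` open and close a resource around a `with` block.

```python
class ToySearcher:
    """Hypothetical stand-in for a searcher that opens and closes an index."""

    def __init__(self, docs):
        # docs: dict mapping document id -> text (an assumption for this sketch).
        self._docs = docs
        self.is_open = False

    def __enter__(self):
        # Acquire the resource when the `with` block starts.
        self.is_open = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Release the resource when the block ends, even after an exception.
        self.is_open = False
        return False  # do not suppress exceptions

    def count(self, term):
        # Count documents containing the term; refuses to run on a closed index.
        if not self.is_open:
            raise RuntimeError("index is not open")
        return sum(term in text.lower().split() for text in self._docs.values())


docs = {"d1": "Headache and migraine", "d2": "Chronic headache", "d3": "Back pain"}
searcher = ToySearcher(docs)
with searcher as s:
    hits = s.count("headache")
```

Once the `with` block exits, the toy searcher is closed again, so calling `count` outside the block raises an error instead of silently reading from a released resource.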
- index: Indexer
The underlying Lucene index.
- indexer
The pybool_ir.index.index.Indexer instance that is used to open the index.
- class pybool_ir.experiments.retrieval.RetrievalExperiment(indexer: pybool_ir.index.index.Indexer, collection: pybool_ir.experiments.collections.Collection, query_parser: pybool_ir.query.parser.QueryParser = <pybool_ir.query.pubmed.parser.PubmedQueryParser object>, eval_measures: typing.List[ir_measures.measures.base.Measure] | None = None, run_path: pathlib.Path | None = None, filter_topics: typing.List[str] | None = None, ignore_dates: bool = False, date_field: str = 'dp')
Bases: LuceneSearcher
This class provides a convenient interface for running retrieval experiments. It can be used as a context manager, which will automatically open and close the index. This class should be used for simple experiments where a collection of queries is executed on the index, for example:
>>> from pybool_ir.experiments.collections import load_collection
>>> from pybool_ir.experiments.retrieval import RetrievalExperiment
>>> from pybool_ir.index.pubmed import PubmedIndexer
>>> from ir_measures import *
>>> import ir_measures
>>>
>>> # Automatically downloads, then loads this collection.
>>> collection = load_collection("ielab/sysrev-seed-collection")
>>> # Point the experiment to your index, your collection.
>>> with RetrievalExperiment(PubmedIndexer(index_path="pubmed"),
...                          collection=collection) as experiment:
...     # Get the run of the experiment.
...     # This automatically executes the queries.
...     run = experiment.run
>>> # Evaluate the run using ir_measures.
>>> ir_measures.calc_aggregate([SetP, SetR, SetF], collection.qrels, run)
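To make the evaluation step concrete, the sketch below computes set-based precision, recall and F1 for a single topic in plain Python. It illustrates what measures such as SetP, SetR and SetF conventionally report and is not ir_measures' implementation; the function name `set_measures` and the sample document ids are made up for this example.

```python
def set_measures(retrieved, relevant):
    """Return (precision, recall, f1) for one topic, treating results as sets."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Guard against empty sets so the sketch never divides by zero.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical document ids for one topic.
retrieved_ids = ["101", "102", "103", "104"]
relevant_ids = ["102", "104", "105"]
precision, recall, f1 = set_measures(retrieved_ids, relevant_ids)
```

Set-based measures suit Boolean retrieval because a Boolean query returns an unranked set of documents rather than a ranked list.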
- go() -> None
Run the experiment without returning anything.