pybool_ir.experiments.collections
Classes and methods for loading collections.
Functions
load_collection(name) | Given the name of a collection, load it from disk.
load_collection_ir_datasets(name) | Load a collection from the ir_datasets package.
parse_clef_tar_topic(topic_str[, ...]) | Helper function that parses a topic from the CLEF TAR collection.
Classes
Collection(identifier, topics, qrels) | A collection contains a list of topics and a list of qrels.
Topic(identifier, description, ...) | A topic contains a query and a date range for reproducing when the query was issued.
- class pybool_ir.experiments.collections.Collection(identifier: str, topics: List[Topic], qrels: List[Qrel])
Bases: object
A collection contains a list of topics and a list of qrels.
- classmethod from_dir(collection_path: Path) -> Collection
Internally, pybool_ir stores collections as a directory with a topics.jsonl file and a qrels file. This ensures a common format for all collections. This method loads a collection in this format.
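As a rough illustration of the layout described above, the following standard-library sketch writes and then reads back a directory containing a topics.jsonl file and a qrels file. It does not call pybool_ir. The function name read_collection_dir, the JSON keys (borrowed from the Topic constructor arguments), the file name qrels, and the four-column TREC-style qrels lines are all assumptions made for the example, not the library's documented on-disk schema.

```python
import json
import tempfile
from pathlib import Path


def read_collection_dir(collection_path: Path):
    """Hypothetical reader: returns (topics, qrels) as plain Python objects."""
    # One JSON object per line in topics.jsonl (the key names are assumed).
    with open(collection_path / "topics.jsonl", encoding="utf-8") as f:
        topics = [json.loads(line) for line in f if line.strip()]

    # Assumed TREC-style qrels: "<topic> <iteration> <doc_id> <relevance>".
    qrels = []
    with open(collection_path / "qrels", encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            topic_id, _, doc_id, relevance = line.split()
            qrels.append((topic_id, doc_id, int(relevance)))
    return topics, qrels


# Build a tiny example directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp)
    topic = {
        "identifier": "T1",
        "description": "Example topic",
        "raw_query": "heart AND attack",
        "date_from": "1940",
        "date_to": "2017",
    }
    (path / "topics.jsonl").write_text(json.dumps(topic) + "\n", encoding="utf-8")
    (path / "qrels").write_text("T1 0 12345 1\nT1 0 67890 0\n", encoding="utf-8")
    topics, qrels = read_collection_dir(path)
```

With the real library, the equivalent step would be a single call to Collection.from_dir with the directory path.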
- class pybool_ir.experiments.collections.Topic(identifier: str, description: str, raw_query: str, date_from: str, date_to: str)
Bases: object
A topic contains a query and a date range for reproducing when the query was issued.
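To make the role of the date range concrete, here is a minimal stand-in that mirrors the documented constructor fields. TopicSketch and its covers_year helper are hypothetical and are not part of pybool_ir. The sketch also assumes that date_from and date_to are four-digit year strings, which matches the defaults of parse_clef_tar_topic ('1940' and '2017') but may not hold for every collection.

```python
from dataclasses import dataclass


@dataclass
class TopicSketch:
    """Stand-in with the same fields as the documented Topic constructor."""
    identifier: str
    description: str
    raw_query: str
    date_from: str
    date_to: str

    def covers_year(self, year: int) -> bool:
        # Hypothetical helper; assumes both dates are four-digit year strings.
        return int(self.date_from) <= year <= int(self.date_to)


topic = TopicSketch(
    identifier="T1",
    description="Example topic",
    raw_query="heart AND attack",
    date_from="1940",
    date_to="2017",
)
in_range = topic.covers_year(2005)
out_of_range = topic.covers_year(2020)
```

Restricting retrieval to documents inside this range is what allows a query to be re-run as it would have behaved when it was originally issued.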
- pybool_ir.experiments.collections.load_collection(name: str) -> Collection
Given the name of a collection, load it from disk. A collection contains a list of topics and a list of qrels. The actual documents for a collection are handled separately.
- pybool_ir.experiments.collections.load_collection_ir_datasets(name: str) -> Collection
Load a collection from the ir_datasets package.
- pybool_ir.experiments.collections.parse_clef_tar_topic(topic_str: str, date_from: str = '1940', date_to: str = '2017', parse_query: bool = False) -> Topic
Helper function that parses a topic from the CLEF TAR collection. These files are in a non-standard TREC format, so this function is used to parse them.
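The sketch below shows one way such a labelled, non-TREC topic file could be parsed with the standard library. It is not pybool_ir's implementation. It assumes a layout with Topic:, Title:, Query:, and Pids: sections, and it assumes that the Topic label maps to the identifier and the Title label to the description. The sample text, including the identifier CD000001, is invented for illustration, and the parse_query option is not modelled.

```python
from typing import Dict, List

# Invented sample in the assumed labelled layout.
SAMPLE_TOPIC = """Topic: CD000001

Title: Example review title

Query:
1. exp Heart Diseases/
2. heart attack*.tw.
3. 1 or 2

Pids:
    12345
    67890
"""


def parse_topic_sketch(
    topic_str: str, date_from: str = "1940", date_to: str = "2017"
) -> Dict[str, object]:
    """Hypothetical parser: collects the lines under each section label."""
    fields: Dict[str, List[str]] = {"Topic": [], "Title": [], "Query": [], "Pids": []}
    current = None
    for line in topic_str.splitlines():
        stripped = line.strip()
        # A line starting with a known label opens a new section.
        label = next((k for k in fields if stripped.startswith(k + ":")), None)
        if label is not None:
            current = label
            stripped = stripped[len(label) + 1:].strip()
        # Non-empty text belongs to the most recently opened section.
        if current is not None and stripped:
            fields[current].append(stripped)
    return {
        "identifier": " ".join(fields["Topic"]),
        "description": " ".join(fields["Title"]),
        "raw_query": "\n".join(fields["Query"]),
        "date_from": date_from,
        "date_to": date_to,
        "pids": fields["Pids"],
    }


parsed = parse_topic_sketch(SAMPLE_TOPIC)
```

The date arguments are passed through unchanged, mirroring how the documented function takes date_from and date_to as parameters with defaults rather than reading them from the topic text.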