API Docs
Record Indexer
API for indexing of records.
-
class invenio_indexer.api.BulkRecordIndexer(search_client=None, exchange=None, queue=None, routing_key=None, version_type=None, record_to_index=None)
Provide an interface for indexing records in Elasticsearch.
Uses bulk indexing by default.
Initialize indexer.
Parameters:
- search_client – Elasticsearch client. (Default: current_search_client)
- exchange – A kombu.Exchange instance for message queue.
- queue – A kombu.Queue instance for message queue.
- routing_key – Routing key for message queue.
- version_type – Elasticsearch version type. (Default: external_gte)
- record_to_index – Function to extract the index and doc_type from the record.
-
index(record)
Index a record.
The caller is responsible for ensuring that the record has already been committed to the database. If a newer version of a record has already been indexed, the provided record will not be indexed. This behavior can be controlled by providing a different version_type when initializing RecordIndexer.
Parameters: record – Record instance.
-
class invenio_indexer.api.Producer(channel, exchange=None, routing_key=None, serializer=None, auto_declare=None, compression=None, on_return=None)
Producer validating published messages.
For more information, see kombu.Producer.
-
class invenio_indexer.api.RecordIndexer(search_client=None, exchange=None, queue=None, routing_key=None, version_type=None, record_to_index=None)
Provide an interface for indexing records in Elasticsearch.
Bulk indexing works by queuing requests for indexing records and processing these requests in bulk.
Initialize indexer.
Parameters:
- search_client – Elasticsearch client. (Default: current_search_client)
- exchange – A kombu.Exchange instance for message queue.
- queue – A kombu.Queue instance for message queue.
- routing_key – Routing key for message queue.
- version_type – Elasticsearch version type. (Default: external_gte)
- record_to_index – Function to extract the index and doc_type from the record.
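As a rough sketch of what a record_to_index callable might look like, the function below maps a record to an index and a doc_type. The function body, the "type" field, the index naming scheme, and the returned (index, doc_type) tuple shape are all assumptions made for illustration; the library source should be checked for the exact contract.

```python
def record_to_index(record):
    """Hypothetical mapper: choose an index and doc_type from a record."""
    # Assumption: records are dict-like and carry a "type" field.
    record_type = record.get("type", "default")
    index = "records-{}".format(record_type)
    doc_type = "{}-v1.0.0".format(record_type)
    return index, doc_type


index, doc_type = record_to_index({"title": "Example", "type": "article"})
print(index, doc_type)
```

A callable like this would be supplied through the record_to_index parameter listed above when the indexer is created.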
-
bulk_delete(record_id_iterator)
Bulk delete records from index.
Parameters: record_id_iterator – Iterator yielding record UUIDs.
-
bulk_index(record_id_iterator)
Bulk index records.
Parameters: record_id_iterator – Iterator yielding record UUIDs.
-
delete(record, **kwargs)
Delete a record.
Parameters:
- record – Record instance.
- kwargs – Passed to elasticsearch.Elasticsearch.delete().
-
delete_by_id(record_uuid, **kwargs)
Delete record from index by record identifier.
Parameters:
- record_uuid – Record identifier.
- kwargs – Passed to RecordIndexer.delete().
-
index(record, arguments=None, **kwargs)
Index a record.
The caller is responsible for ensuring that the record has already been committed to the database. If a newer version of a record has already been indexed, the provided record will not be indexed. This behavior can be controlled by providing a different version_type when initializing RecordIndexer.
Parameters: record – Record instance.
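The version rule described for index() can be illustrated with a small in-memory model. This is neither the library's nor Elasticsearch's code: the FakeIndex class and its fields are hypothetical, and the sketch models only the rule that, with external_gte versioning, a document is written when the supplied version is greater than or equal to the stored one.

```python
class FakeIndex:
    """Hypothetical in-memory stand-in for a search index (illustration only)."""

    def __init__(self):
        self.docs = {}  # doc id -> (version, body)

    def index(self, doc_id, version, body):
        """Store the document unless a newer version is already indexed."""
        stored = self.docs.get(doc_id)
        if stored is not None and version < stored[0]:
            return False  # a newer version is already indexed; skip this one
        self.docs[doc_id] = (version, body)
        return True


idx = FakeIndex()
first = idx.index("rec-1", 2, {"title": "second revision"})
stale = idx.index("rec-1", 1, {"title": "first revision"})
same = idx.index("rec-1", 2, {"title": "second revision, re-sent"})
print(first, stale, same)
```

A real cluster signals the stale write as a version conflict rather than returning False; the boolean here only keeps the sketch short.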
-
index_by_id(record_uuid, **kwargs)
Index a record by record identifier.
Parameters:
- record_uuid – Record identifier.
- kwargs – Passed to RecordIndexer.index().
-
mq_exchange
Message Queue exchange.
Returns: The Message Queue exchange.
-
mq_queue
Message Queue queue.
Returns: The Message Queue queue.
-
mq_routing_key
Message Queue routing key.
Returns: The Message Queue routing key.
-
process_bulk_queue(es_bulk_kwargs=None)
Process bulk indexing queue.
Parameters: es_bulk_kwargs (dict) – Passed to elasticsearch.helpers.bulk().
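The queue-then-process pattern behind bulk indexing can be sketched with the standard library alone. The names pending, enqueue, and process_queue are hypothetical stand-ins and the message shape is an assumption; the real indexer relies on a message queue (kombu) and passes its work to elasticsearch.helpers.bulk(), neither of which is used here.

```python
from collections import deque

pending = deque()  # hypothetical stand-in for the message queue


def enqueue(op, record_ids):
    """Queue one request per record id instead of indexing immediately."""
    for record_id in record_ids:
        pending.append({"op": op, "id": str(record_id)})


def process_queue(batch_size=2):
    """Drain the queue and group the queued requests into batches."""
    batches = []
    while pending:
        size = min(batch_size, len(pending))
        batches.append([pending.popleft() for _ in range(size)])
    return batches


enqueue("index", ["a", "b", "c"])
enqueue("delete", ["d"])
batches = process_queue()
print(len(batches), len(pending))
```

Queuing keeps the calling code fast, while the batches amortize the cost of talking to the search cluster.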
-
record_cls
alias of Record
Flask Extension
Flask extension for Invenio-Indexer.
Celery tasks
Celery tasks to index records.
-
invenio_indexer.tasks.delete_record(record_uuid)
Delete a single record.
Parameters: record_uuid – The record UUID.
-
invenio_indexer.tasks.index_record(record_uuid)
Index a single record.
Parameters: record_uuid – The record UUID.
-
invenio_indexer.tasks.process_bulk_queue(version_type=None, es_bulk_kwargs=None)
Process bulk indexing queue.
Parameters:
- version_type (str) – Elasticsearch version type.
- es_bulk_kwargs (dict) – Passed to elasticsearch.helpers.bulk().
Note: You can start multiple versions of this task.
Signals
Signals for indexer.
-
invenio_indexer.signals.before_record_index = <blinker.base.NamedSignal 'before-record-index'>
Signal sent before a record is indexed.
The sender is the current Flask application, and the following keyword arguments are provided:
- json: The dumped record dictionary, which can be modified.
- record: The record being indexed.
- index: The index in which the record will be indexed.
- doc_type: The doc_type for the record.
- arguments: The arguments to pass to Elasticsearch for indexing.
- **kwargs: Extra arguments.
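A receiver for this signal might look like the sketch below. The function name and the two fields it adds are hypothetical; only the keyword argument names come from the list above. The receiver is called directly with made-up values so the example runs without a Flask application or a search cluster.

```python
from datetime import datetime, timezone


def add_indexing_metadata(sender, json=None, record=None, index=None,
                          doc_type=None, arguments=None, **kwargs):
    """Hypothetical receiver: enrich the document before it is indexed."""
    # `json` is the dumped record dictionary; it is modified in place.
    json["indexed_at"] = datetime.now(timezone.utc).isoformat()
    json["indexed_in"] = index


# Direct call with made-up values, standing in for the signal being sent.
doc = {"title": "Example"}
add_indexing_metadata(None, json=doc, record=None, index="records-v1.0.0",
                      doc_type="record-v1.0.0", arguments={})
print(sorted(doc))
```

In an application, such a function would typically be attached with before_record_index.connect(add_indexing_metadata), the usual blinker way of connecting a receiver.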
This signal also has a .dynamic_connect() method which allows more flexible ways to connect receivers to it. The most common use case is applying a receiver only to a specific index. For more complex conditions, you can provide a function via the condition_func parameter.
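The idea of restricting a receiver to one index can be sketched in plain Python. This is not how dynamic_connect() is implemented and it does not use that method's API; only_for_index and tag_document are hypothetical helpers that merely illustrate conditional dispatch on the index keyword argument.

```python
def only_for_index(receiver, index):
    """Wrap a receiver so it runs only for the given index (illustration)."""
    def wrapper(sender, **signal_kwargs):
        # Call through only when the signal's index matches the wanted one.
        if signal_kwargs.get("index") == index:
            return receiver(sender, **signal_kwargs)
        return None
    return wrapper


def tag_document(sender, json=None, **kwargs):
    """Hypothetical receiver that marks the document being indexed."""
    json["tagged"] = True


filtered = only_for_index(tag_document, "records-v1.0.0")

matching, other = {}, {}
filtered(None, json=matching, index="records-v1.0.0")
filtered(None, json=other, index="authors-v1.0.0")
print(matching, other)
```

A condition function generalizes the same idea: instead of comparing a single value, an arbitrary predicate decides whether the receiver should run.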