API Docs
Record Indexer
API for indexing of records.
-
class invenio_indexer.api.BulkRecordIndexer(search_client=None, exchange=None, queue=None, routing_key=None, version_type=None, record_to_index=None)
Provide an interface for indexing records in Elasticsearch.
Uses bulk indexing by default.
Initialize indexer.
Parameters:
- search_client – Elasticsearch client. (Default: current_search_client)
- exchange – A kombu.Exchange instance for message queue.
- queue – A kombu.Queue instance for message queue.
- routing_key – Routing key for message queue.
- version_type – Elasticsearch version type. (Default: external_gte)
- record_to_index – Function to extract the index and doc_type from the record.
-
index(record)
Index a record.
The caller is responsible for ensuring that the record has already been committed to the database. If a newer version of a record has already been indexed, the provided record will not be indexed. This behavior can be controlled by providing a different version_type when initializing RecordIndexer.
Parameters: record – Record instance.
-
class invenio_indexer.api.Producer(channel, exchange=None, routing_key=None, serializer=None, auto_declare=None, compression=None, on_return=None)
Producer validating published messages.
For more information visit kombu.Producer.
-
class invenio_indexer.api.RecordIndexer(search_client=None, exchange=None, queue=None, routing_key=None, version_type=None, record_to_index=None)
Provide an interface for indexing records in Elasticsearch.
Bulk indexing works by queuing requests for indexing records and processing these requests in bulk.
Initialize indexer.
Parameters:
- search_client – Elasticsearch client. (Default: current_search_client)
- exchange – A kombu.Exchange instance for message queue.
- queue – A kombu.Queue instance for message queue.
- routing_key – Routing key for message queue.
- version_type – Elasticsearch version type. (Default: external_gte)
- record_to_index – Function to extract the index and doc_type from the record.
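As an illustration of the record_to_index parameter, the sketch below shows one possible shape for such a function. It assumes the function receives a dict-like record and returns an (index, doc_type) tuple. The "$schema"-based naming rule, the schema_to_index_name helper, and the default values are hypothetical and are not taken from the library.

```python
import posixpath


def schema_to_index_name(schema_url):
    """Hypothetical helper: derive an index name from a schema URL."""
    # ".../schemas/records/record-v1.0.0.json" -> "record-v1.0.0"
    filename = posixpath.basename(schema_url)
    name, _extension = posixpath.splitext(filename)
    return name


def record_to_index(record, default_index="records", default_doc_type="_doc"):
    """Return an (index, doc_type) pair for a dict-like record.

    Assumption: the indexer calls this function with the record and
    expects both values back as a tuple.
    """
    schema = record.get("$schema")
    if not schema:
        # No schema information: fall back to the (hypothetical) defaults.
        return default_index, default_doc_type
    return schema_to_index_name(schema), default_doc_type
```

Under these assumptions, the function would be passed as the record_to_index argument when the indexer is constructed.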
-
bulk_delete(record_id_iterator)
Bulk delete records from index.
Parameters: record_id_iterator – Iterator yielding record UUIDs.
-
bulk_index(record_id_iterator)
Bulk index records.
Parameters: record_id_iterator – Iterator yielding record UUIDs.
-
delete(record, **kwargs)
Delete a record.
Parameters:
- record – Record instance.
- kwargs – Passed to elasticsearch.Elasticsearch.delete().
-
delete_by_id(record_uuid, **kwargs)
Delete record from index by record identifier.
Parameters:
- record_uuid – Record identifier.
- kwargs – Passed to RecordIndexer.delete().
-
index(record, arguments=None, **kwargs)
Index a record.
The caller is responsible for ensuring that the record has already been committed to the database. If a newer version of a record has already been indexed, the provided record will not be indexed. This behavior can be controlled by providing a different version_type when initializing RecordIndexer.
Parameters: record – Record instance.
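The version check described above can be illustrated without a running cluster. The sketch below is a plain-Python model of external versioning. It assumes that external_gte accepts a write when the supplied version is greater than or equal to the stored one, and that external requires a strictly greater version. FakeIndex and VersionConflict are hypothetical stand-ins, not part of invenio_indexer or the Elasticsearch client, and a rejected write is modelled here as an exception.

```python
class VersionConflict(Exception):
    """Raised when a write carries a version the index rejects."""


class FakeIndex:
    """In-memory stand-in for an index that uses external versioning."""

    def __init__(self, version_type="external_gte"):
        self.version_type = version_type
        self.docs = {}  # record id -> (version, body)

    def index(self, record_id, version, body):
        stored = self.docs.get(record_id)
        if stored is not None:
            stored_version = stored[0]
            if self.version_type == "external_gte":
                accepted = version >= stored_version
            else:  # "external": the new version must be strictly greater
                accepted = version > stored_version
            if not accepted:
                raise VersionConflict(
                    f"{record_id}: stored version {stored_version}, got {version}"
                )
        self.docs[record_id] = (version, body)


index = FakeIndex()
index.index("rec-1", 2, {"title": "second revision"})
# Same version again: accepted under external_gte.
index.index("rec-1", 2, {"title": "second revision, re-sent"})
try:
    # Older version than the stored one: rejected.
    index.index("rec-1", 1, {"title": "stale revision"})
except VersionConflict:
    stale_rejected = True
else:
    stale_rejected = False
```

In this model the stale write leaves the stored document untouched, which mirrors the statement that an older record is not indexed over a newer one.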
-
index_by_id(record_uuid, **kwargs)
Index a record by record identifier.
Parameters:
- record_uuid – Record identifier.
- kwargs – Passed to RecordIndexer.index().
-
mq_exchange
Message Queue exchange.
Returns: The Message Queue exchange.
-
mq_queue
Message Queue queue.
Returns: The Message Queue queue.
-
mq_routing_key
Message Queue routing key.
Returns: The Message Queue routing key.
-
process_bulk_queue(es_bulk_kwargs=None)
Process bulk indexing queue.
Parameters: es_bulk_kwargs (dict) – Passed to elasticsearch.helpers.bulk().
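The queue-then-process flow behind bulk_index(), bulk_delete() and process_bulk_queue() can be sketched with an in-memory queue. This is a simplified model only: the message shape, the enqueue and process_queue helpers, and the use of collections.deque in place of a kombu queue are assumptions made for illustration.

```python
from collections import deque
import uuid

bulk_queue = deque()  # stand-in for the message queue


def enqueue(op, record_ids):
    """Queue one message per record id instead of touching the index."""
    for record_id in record_ids:
        bulk_queue.append({"op": op, "id": str(record_id)})


def process_queue(send_bulk):
    """Drain the queue and hand all pending actions over in one batch."""
    actions = []
    while bulk_queue:
        actions.append(bulk_queue.popleft())
    if actions:
        send_bulk(actions)
    return len(actions)


to_index = [uuid.uuid4() for _ in range(3)]
to_delete = [uuid.uuid4()]
enqueue("index", to_index)
enqueue("delete", to_delete)

# Collect batches in a list; a real consumer would send them to the
# search backend in a single bulk request.
batches = []
processed = process_queue(batches.append)
```

The point of the design is that producers only publish small messages, while the expensive index writes happen later and in batches.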
-
record_cls
alias of Record
Flask Extension
Flask extension for Invenio-Indexer.
Celery tasks
Celery tasks to index records.
-
invenio_indexer.tasks.delete_record(record_uuid)
Delete a single record.
Parameters: record_uuid – The record UUID.
-
invenio_indexer.tasks.index_record(record_uuid)
Index a single record.
Parameters: record_uuid – The record UUID.
-
invenio_indexer.tasks.process_bulk_queue(version_type=None, es_bulk_kwargs=None)
Process bulk indexing queue.
Parameters:
- version_type (str) – Elasticsearch version type.
- es_bulk_kwargs (dict) – Passed to elasticsearch.helpers.bulk().
Note: You can start multiple versions of this task.
Signals
Signals for indexer.
-
invenio_indexer.signals.before_record_index
A blinker NamedSignal named 'before-record-index'.
Signal sent before a record is indexed.
The sender is the current Flask application, and the following keyword arguments are provided:
- json: The dumped record dictionary which can be modified.
- record: The record being indexed.
- index: The index in which the record will be indexed.
- doc_type: The doc_type for the record.
- arguments: The arguments to pass to Elasticsearch for indexing.
- **kwargs: Extra arguments.
This signal also has a .dynamic_connect() method which allows more flexible ways of connecting receivers to it. The most common use case is applying a receiver only to a specific index. In that case you can connect the receiver with .dynamic_connect() instead.
For more complex conditions you can provide a function via the condition_func parameter.
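As a rough illustration of the idea behind conditional receivers, the sketch below uses a hypothetical FilteredSignal stand-in written in plain Python. It does not call invenio_indexer or blinker, and the actual signature of .dynamic_connect() may differ. The index names and the receiver are also made up for the example.

```python
class FilteredSignal:
    """Tiny stand-in for a signal that supports conditional receivers."""

    def __init__(self):
        self._receivers = []  # list of (receiver, condition) pairs

    def connect(self, receiver, condition=None):
        """Register a receiver; condition decides per send whether it runs."""
        self._receivers.append((receiver, condition))

    def send(self, sender, **kwargs):
        """Call every receiver whose condition accepts this send."""
        for receiver, condition in self._receivers:
            if condition is None or condition(sender, **kwargs):
                receiver(sender, **kwargs)


before_index = FilteredSignal()


def add_indexed_flag(sender, json=None, **kwargs):
    # The dumped record dictionary may be modified in place.
    json["indexed_by_receiver"] = True


# Only run the receiver for one specific index (hypothetical index name).
before_index.connect(
    add_indexed_flag,
    condition=lambda sender, index=None, **kwargs: index == "records-v1.0.0",
)

matching = {"title": "A"}
other = {"title": "B"}
before_index.send("app", json=matching, record=None, index="records-v1.0.0")
before_index.send("app", json=other, record=None, index="authors-v1.0.0")
```

Only the dictionary sent for the matching index is modified; the other one passes through unchanged.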