This Python tool watches the postgres systemmetadata tables, pulling out entries with dateModified more recent than a specified value.
Each identifier is then looked up in the Solr index and flagged if the indexed dateModified does not match that of the systemMetadata.
The process is fairly efficient and may provide a basis for a replacement for the index-task-generator, which currently relies on Hazelcast events.
Output is written to a JSON file that can be rendered with a simple Handsontable implementation.
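
The comparison step described above can be sketched roughly as follows. This is an illustration only, not the code in this repository: the function names, the shape of the inputs (rows of `(pid, dateModified)` from postgres and a dict of `pid -> dateModified` from Solr), and the output record layout are all assumptions. It also assumes that an identifier missing from the index should be flagged, which the description above does not state.

```python
import json
from datetime import datetime, timezone


def parse_timestamp(value):
    """Return a timezone-aware UTC datetime from a datetime or ISO 8601 string.

    Naive values are assumed to be UTC (an assumption for this sketch).
    """
    if isinstance(value, datetime):
        parsed = value
    else:
        # Solr writes a trailing "Z"; fromisoformat() on older Pythons needs an offset.
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def find_mismatches(db_rows, solr_dates):
    """Flag identifiers whose indexed dateModified differs from system metadata.

    db_rows: iterable of (pid, dateModified) taken from postgres.
    solr_dates: dict mapping pid -> dateModified as found in the Solr index.
    """
    flagged = []
    for pid, db_value in db_rows:
        db_modified = parse_timestamp(db_value)
        indexed_value = solr_dates.get(pid)
        if indexed_value is None:
            # Assumption: an identifier absent from the index is also reported.
            flagged.append({"pid": pid,
                            "sysmeta": db_modified.isoformat(),
                            "solr": None})
            continue
        indexed = parse_timestamp(indexed_value)
        if indexed != db_modified:
            flagged.append({"pid": pid,
                            "sysmeta": db_modified.isoformat(),
                            "solr": indexed.isoformat()})
    return flagged


def to_json(flagged):
    """Serialize the flagged entries as a JSON array of row objects."""
    return json.dumps(flagged, indent=2)
```

Comparing parsed, UTC-normalized values rather than raw strings avoids false positives when the two systems format the same instant differently (for example `Z` versus `+00:00`, or with and without milliseconds).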
There are several approaches to postgres event notification:
- Polling. Periodic query to retrieve a (hopefully) limited set of changes. Implemented in `main.py`.
- Listening for `pg_notify` events. Implemented in `listen.py`.
- Using a postgres extension to call an AMQP message queue. Listener in `listenq.py`.
- Write events to a queue table (e.g. using a trigger) and retrieve events by querying the table using the `SKIP LOCKED` semantics. Not implemented here; some discussion on Stack Overflow. Simple example on HN.
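
As a rough illustration of the first and last approaches, the sketch below shows a polling query and a queue-table claim using `FOR UPDATE SKIP LOCKED`. It is not the code in `main.py`. The column names `guid` and `date_modified`, the `index_queue` table with its `id` column, and both function names are assumptions made for this example. The functions take any DB-API cursor that uses the `%s` parameter style (as psycopg2 does).

```python
# Hypothetical polling query: table and column names are assumptions.
POLL_SQL = """
SELECT guid, date_modified
FROM systemmetadata
WHERE date_modified > %s
ORDER BY date_modified
LIMIT %s
"""

# Hypothetical queue table "index_queue", e.g. filled by a trigger.
# SKIP LOCKED lets concurrent workers claim disjoint batches without blocking.
CLAIM_SQL = """
DELETE FROM index_queue
WHERE id IN (
    SELECT id
    FROM index_queue
    ORDER BY id
    LIMIT %s
    FOR UPDATE SKIP LOCKED
)
RETURNING guid, date_modified
"""


def poll_changes(cursor, since, limit=1000):
    """Return (rows, newest) for entries modified after `since`.

    `newest` is the date_modified of the last row returned, or `since`
    unchanged when nothing was found; pass it to the next call.
    """
    cursor.execute(POLL_SQL, (since, limit))
    rows = cursor.fetchall()
    newest = rows[-1][1] if rows else since
    return rows, newest


def claim_events(cursor, batch_size=100):
    """Remove and return up to `batch_size` queued events.

    The rows stay claimed only if the surrounding transaction commits;
    on rollback they reappear in the queue for another worker.
    """
    cursor.execute(CLAIM_SQL, (batch_size,))
    return cursor.fetchall()
```

One caveat with this polling sketch: because it combines a strict `>` comparison with `LIMIT`, rows sharing the timestamp at the batch boundary can be skipped on the next call. Ordering and filtering on a `(date_modified, guid)` pair is one way to avoid that.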