Zookeeper

We use Zookeeper to track locks between the autonomous components. To make Zookeeper easier to use, we access it through the Netflix Curator API: http://curator.incubator.apache.org/

Zookeeper maintains a tree structure internally. Each leaf in the tree should correspond to some real-world thing. A leaf can be locked, and the system is based on the idea that you lock the leaf before accessing the thing it represents.

We have two kinds of leaves: an SBOI lock per component, and a batch lock per component and batch round trip.

The specific lock paths are as follows:

SBOI lock = "/SBOI/" + runnable.getComponentName();

batch lock = "/" + runnable.getComponentName() + "/B" + batch.getBatchID() + "-RT" + batch.getRoundTripNumber();
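To make the two path shapes concrete, here is a minimal sketch that builds them from plain values. The class name LockPaths, the method names, and the sample component name and batch ID are all hypothetical, and the sketch assumes the batch ID can be treated as a string.

```java
// Hypothetical helper: the class and method names are illustrative and not taken from the project.
final class LockPaths {
    private LockPaths() {}

    /** Path of the per-component SBOI lock. */
    static String sboiLock(String componentName) {
        return "/SBOI/" + componentName;
    }

    /** Path of the lock for one batch round trip, as seen by one component. */
    static String batchLock(String componentName, String batchId, int roundTripNumber) {
        // Assumption: the batch ID is a string; the real getBatchID() may return another type.
        return "/" + componentName + "/B" + batchId + "-RT" + roundTripNumber;
    }

    public static void main(String[] args) {
        System.out.println(sboiLock("MyComponent"));             // prints /SBOI/MyComponent
        System.out.println(batchLock("MyComponent", "4099", 1)); // prints /MyComponent/B4099-RT1
    }
}
```

Because the component name is part of both paths, two different components never contend for the same leaf, even when they handle the same batch.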

Autonomous component locking procedure 1

Each autonomous component will run as a cron job. Periodically, typically about once a minute, the cron job will activate and follow the procedure below.

1. Lock "SBOI+component name".
2. Query SBOI for batches.
3. For each returned batch:
   a. Attempt to lock "batch+component name".
   b. If successful, break out of the loop.
4. Unlock "SBOI+component name".
5. Do work on the locked batch (if any).
6. Store results in DOMS.
7. Unlock "batch+component name".
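The procedure above can be sketched as follows. This is not the existing implementation: LockRegistry is a hypothetical in-memory stand-in for the Zookeeper leaves, batches are represented by their path suffix (for example "B2-RT1"), and giving up immediately when the SBOI lock is busy is an assumption, since a real component might instead wait with a timeout.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for the lock server; a real run would lock Zookeeper leaves.
final class LockRegistry {
    private final Set<String> held = ConcurrentHashMap.newKeySet();

    /** Non-blocking attempt: true if the path was free and is now held. */
    boolean tryLock(String path) { return held.add(path); }

    void unlock(String path) { held.remove(path); }

    boolean isLocked(String path) { return held.contains(path); }
}

final class ProcedureOne {
    /** One cron activation: claim at most one batch, then work on it outside the SBOI lock. */
    static Optional<String> runOnce(LockRegistry locks, String component,
                                    List<String> batchesFromSboi, Consumer<String> work) {
        String sboiLock = "/SBOI/" + component;
        if (!locks.tryLock(sboiLock)) {
            return Optional.empty(); // assumption: skip this activation if selection is busy
        }
        String claimed = null;
        try {
            for (String batch : batchesFromSboi) { // the result of the SBOI query
                if (locks.tryLock("/" + component + "/" + batch)) {
                    claimed = batch;
                    break; // one batch per activation
                }
            }
        } finally {
            locks.unlock(sboiLock); // the SBOI lock is held only while selecting a batch
        }
        if (claimed == null) {
            return Optional.empty(); // every returned batch was already locked
        }
        try {
            work.accept(claimed); // do the work and store results in DOMS
        } finally {
            locks.unlock("/" + component + "/" + claimed);
        }
        return Optional.of(claimed);
    }
}
```

Releasing the SBOI lock before the work starts is what lets a second activation of the same component pick up the next free batch while the first is still busy.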

Autonomous component locking procedure 2

Each autonomous component will run as a cron job. Periodically, typically about once a minute, the cron job will activate and follow the procedure below.

1. Lock "SBOI+component name".
2. Query SBOI for batches.
3. For each returned batch:
   a. Attempt to lock "batch+component name".
   b. Do work on the locked batch (if any), possibly spawning a subprocess.
   c. Store results in DOMS.
   d. Unlock "batch+component name".
4. Unlock "SBOI+component name".
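A minimal sketch of this second procedure, under the same kind of assumptions: the lock set is a hypothetical in-memory stand-in for the Zookeeper leaves, batches are represented by their path suffix, batches that are already locked are simply skipped, and the work runs sequentially instead of in spawned subprocesses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch; the class name and the in-memory lock set are illustrative only.
final class ProcedureTwo {
    // Stand-in for the Zookeeper leaves: a path is locked while it is in this set.
    private final Set<String> heldLocks = ConcurrentHashMap.newKeySet();

    boolean tryLock(String path) { return heldLocks.add(path); }

    void unlock(String path) { heldLocks.remove(path); }

    /** One cron activation: work through every free batch while holding the SBOI lock. */
    List<String> runOnce(String component, List<String> batchesFromSboi, Consumer<String> work) {
        List<String> processed = new ArrayList<>();
        String sboiLock = "/SBOI/" + component;
        if (!tryLock(sboiLock)) {
            return processed; // assumption: skip this activation if another instance is running
        }
        try {
            for (String batch : batchesFromSboi) { // the result of the SBOI query
                String batchLock = "/" + component + "/" + batch;
                if (!tryLock(batchLock)) {
                    continue; // batch is locked elsewhere, move on to the next one
                }
                try {
                    work.accept(batch); // do the work and store results in DOMS
                    processed.add(batch);
                } finally {
                    unlock(batchLock);
                }
            }
        } finally {
            unlock(sboiLock); // held for the whole activation, unlike in procedure 1
        }
        return processed;
    }
}
```

Any parallelism would have to be added inside the loop, for example by handing each locked batch to a worker and releasing its lock when that worker finishes.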

Which one will we use?

Method 1 is simpler to implement (and is already done). It holds only a very short-lived lock on SBOI. To run method 1 on three batches, just start it three times.

Method 2 is conceptually simpler to understand. It allows a developer-controlled degree of parallelism, but it also forces us to code that parallelism ourselves.

Zookeeper lock server for the autonomous components
