Details
- Type: New Feature
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 0.8.10
- Fix Version/s: None
- Component/s: Chef Server
- Labels: None
Description
Optimistic locking for data bag items would be useful for implementing locking primitives in Chef recipes.
Some background: I've been writing a recipe for Elastic Search, a clustered search service that stores index shards in memory and replicates them to other nodes. The non-persisted nature of the index shards means we can't do the usual restart notifications, because if more than one node is restarting concurrently, we could lose data.
What we need is a way to ensure that only one node is ever restarting at a time. This is currently impossible, but could be implemented pretty easily with optimistic locking. Here's one possible implementation:
- A "lock" data bag item has one attribute: a lock timestamp.
- When a client wants to acquire the lock, she gets the lock data bag item (along with the item version number) and compares the timestamp to the current time. If the current time is greater than the timestamp, the client tries to acquire the lock by setting the item timestamp to the current time + 5 minutes (or some other delay). She sends the item version number back with the update.
- If the PUT succeeds, the client has the lock and can notify a restart.
- If the PUT fails, the client waits until the next run to try again.
This ensures that only one client holds the lock at a time. The trade-off is that each restart has 5 minutes to complete before the lock expires and the next one can start.
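The client side of the scheme above could look roughly like the following Python sketch. It is only an illustration: `InMemoryLockStore`, `LockItem`, `VersionConflict`, `try_acquire`, and `LOCK_SECONDS` are hypothetical names, and the in-memory store stands in for a versioned data bag item on the Chef server, which does not exist today.

```python
import time
from dataclasses import dataclass


class VersionConflict(Exception):
    """Raised when a write carries a stale version number."""


@dataclass
class LockItem:
    locked_until: float  # the lock timestamp (seconds since the epoch)
    version: int         # item version number used for optimistic locking


class InMemoryLockStore:
    """Hypothetical stand-in for a versioned "lock" data bag item."""

    def __init__(self):
        self._item = LockItem(locked_until=0.0, version=1)

    def get(self):
        # Return a copy so callers cannot mutate the stored item directly.
        return LockItem(self._item.locked_until, self._item.version)

    def put(self, locked_until, version):
        # Reject the write unless the caller read the latest version.
        if version != self._item.version:
            raise VersionConflict("item changed since it was read")
        self._item = LockItem(locked_until, version + 1)


LOCK_SECONDS = 5 * 60  # assumed delay; the ticket suggests 5 minutes


def try_acquire(store, now=None):
    """Return True if this client got the lock, False otherwise."""
    now = time.time() if now is None else now
    item = store.get()
    if now <= item.locked_until:
        return False  # lock still held; try again on the next run
    try:
        store.put(now + LOCK_SECONDS, item.version)
    except VersionConflict:
        return False  # another client won the race
    return True
```

A recipe would call `try_acquire(store)` once per run and only notify the restart when it returns `True`. There is deliberately no retry loop, since the next Chef run is the retry.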
Optimistic locking seems like it would be pretty easy to implement on the server side, since CouchDB already supports it. For GETs, return the _rev in the response body if locking is requested. For updates (the PUT described above), if _rev is present in the request body, use it when updating CouchDB. Return a 409 if the update fails.
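The server-side behaviour proposed here might be sketched as follows. This is not Chef server code: `RevStore` and its methods are hypothetical, the store is a plain dictionary rather than CouchDB, and revisions are simple counters rather than real CouchDB revision strings.

```python
class RevStore:
    """Hypothetical stand-in for the CouchDB-backed data bag item store."""

    def __init__(self):
        self._docs = {}  # item id -> (revision counter, data dict)

    def get(self, item_id, want_rev=False):
        """Return (status, body); include _rev only if locking is requested."""
        rev, data = self._docs[item_id]
        body = dict(data)
        if want_rev:
            body["_rev"] = str(rev)
        return 200, body

    def update(self, item_id, body):
        """Return (status, body); 409 if a supplied _rev is stale."""
        body = dict(body)
        sent_rev = body.pop("_rev", None)
        current_rev, _ = self._docs.get(item_id, (0, {}))
        # Only enforce the check when the client opted in by sending _rev.
        if sent_rev is not None and sent_rev != str(current_rev):
            return 409, {"error": "conflict"}
        self._docs[item_id] = (current_rev + 1, body)
        return 200, body
```

Updates that omit `_rev` stay unconditional in this sketch, so existing clients would keep working unchanged.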
Activity
This would be very useful for starting up 3 members of a MongoDB replica set and then telling exactly one member to initiate the replica set. Otherwise that is hard to do with Chef. I imagine many clustering systems with automatic failover, like MongoDB replica sets, would benefit from a locking mechanism in the Chef server.