Hard Stickiness for Actors (Non-durable) #97
Signed-off-by: Cassandra Coyle <cassie@diagrid.io>
How will actors be redistributed or load-balanced evenly when more capacity is added, say, when a new instance of a deployment comes online?
Given the trade-off, I'd really like to see this added to the SDKs as an option when the actor is instantiated, so it's only done on an opt-in basis. Since it's just another option to set on the initial call, this seems like a really tiny SDK change, so I'd be in favor of a tweak to the proposal to support it across the SDKs.
On leader failover, sticky actors remain active. The new Scheduler leader blocks all sticky acquire requests until a quorum
of sidecars reconnect and send their claims, then rebuilds the table. Only owned active actors are in `sticky_claims`. This
guarantees zero double activation, as the leader waits for reports from a quorum before allowing new grants.
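To make the failover rule above easier to discuss, here is a minimal sketch of the quorum gate. It is not the proposal's implementation. `StickyTable`, `report_claims`, `acquire`, and `expected_sidecars` are hypothetical names, and treating quorum as a strict majority of a known sidecar set is an assumption, since the excerpt does not define how quorum is computed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class StickyTable:
    """Hypothetical in-memory sticky table held by the new Scheduler leader."""

    expected_sidecars: Set[str]  # assumption: the leader knows which sidecars to expect
    reported: Set[str] = field(default_factory=set)
    sticky_claims: Dict[str, str] = field(default_factory=dict)  # actor ID -> sidecar ID

    def quorum_reached(self) -> bool:
        # Assumption: quorum means a strict majority of the expected sidecars.
        return len(self.reported) > len(self.expected_sidecars) // 2

    def report_claims(self, sidecar: str, active_actors: List[str]) -> None:
        """A reconnecting sidecar reports the sticky actors it currently hosts."""
        self.reported.add(sidecar)
        for actor_id in active_actors:
            # In this sketch the first recorded owner is kept.
            self.sticky_claims.setdefault(actor_id, sidecar)

    def acquire(self, actor_id: str, sidecar: str) -> Optional[str]:
        """Return the owning sidecar, or None while acquires are still blocked."""
        if not self.quorum_reached():
            return None
        # Grant to the requester only if no sidecar already owns the actor.
        return self.sticky_claims.setdefault(actor_id, sidecar)


table = StickyTable(expected_sidecars={"a", "b", "c"})
print(table.acquire("actor-1", "c"))  # None: no quorum yet, acquire is blocked
table.report_claims("a", ["actor-1"])
table.report_claims("b", [])
print(table.acquire("actor-1", "c"))  # a: existing owner is preserved
print(table.acquire("actor-2", "c"))  # c: new grant after quorum
```

The sketch leaves open what happens when a sidecar outside the quorum reports a claim late, which depends on how the proposal defines quorum.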
Just confirming: this means that every time the Scheduler leader goes down, all sticky actor IDs will be deactivated and re-activated, potentially on a new daprd, yes? But if a non-leader goes down, they will stay intact on their daprd?
- A full Scheduler outage resets the in-memory sticky table; only idle actors lose stickiness, until their next activation.
- One extra round trip to Scheduler on first activation for a sticky ID.
- Keep-alive stickiness could increase memory and CPU usage.
- Leader failover delays new activations until quorum (safe handoff), preserving Dapr's guarantee against double activation.
If there is a sudden increase in sticky actors (e.g., from scaling), then new actors can't be activated until they are all registered, right?
Does quorum consist of all the sticky actor IDs that exist in the placement table?