From b8fc9f9ce433a09d0c9d42f11ea17c25f7b12f21 Mon Sep 17 00:00:00 2001 From: Whit Waldo Date: Sun, 5 Jan 2025 16:50:59 -0600 Subject: [PATCH 1/6] Added first draft to centralized cache proposal Signed-off-by: Whit Waldo --- 20250105-BCRS-centralized-cache.md | 215 +++++++++++++++++++++++++++++ 1 file changed, 215 insertions(+) create mode 100644 20250105-BCRS-centralized-cache.md diff --git a/20250105-BCRS-centralized-cache.md b/20250105-BCRS-centralized-cache.md new file mode 100644 index 0000000..d19cb8f --- /dev/null +++ b/20250105-BCRS-centralized-cache.md @@ -0,0 +1,215 @@ +# Centralized Cache Building Block +- Author: Whit Waldo (@whitwaldo) +- Updated: 2025-01-05 + +## Overview +This is the first of [many proposals](https://github.com/dapr/dapr/issues/7339) to overhaul the state building block. Rather than continue to iterate on the existing state building block and potentially disrupt existing users of those capabilities, I propose a clean break towards purpose-specific second-generation State Management building blocks in that the current generation can continue to be maintained for the foreseeable future, but that new work is focused on these newer blocks that seek less to adopt a kitchen-sink approach and instead are narrowly tailored for their use cases and purposes. + +This proposal focuses on the implementation of a centralized cache store. While both this proposal and the existing state API are both designed around key/value stores, the proposed shape of this API different from the existing state store in subtle, but helpful ways and makes for some low-hanging fruit to get the ball rolling on these replacement opportunities. As it's a clean break from past decisions, it has the benefit of being a fresh start that favors a concise, simple API that addresses cache-specific functionality and nothing more. Should a developer want something more from the API such as an ability to list existing keys or query the values with some constraint, they're encouraged to turn to other specialized stores for that functionality such as my proposed rethink of a [Key/Value Store](https://github.com/dapr/dapr/issues/7338) or the propsoed [Document Store](https://github.com/dapr/dapr/issues/5146), respectively. + +## What's the purpose of a cache? +As this is a purpose-built implementation, I wanted to take a moment to specifically call out the problem this block seeks to provide a solution for. + +A cache should temporarily store some data (typically something not too large) that's associated with a key for some specified amount of time. While there are variations of the concept that distribute the cache across multiple points of failure (introducing the CAP theorem trade-off of limiting consistency in favor of availability) or that implement a local in-memory cache accessible to perhaps just a single class, this proposal seeks to augment and utilize Dapr's rich existing infrastructure to offer these caching capabilities for a single centralized provider. In other words, this proposal is specifically a "Centralized Cache" proposal and not one trying to provide a distributed cache via the sidecar runtime (though this could just as easily be the focus of another specialized state proposal). + +## Why not use the existing state store? + +Again, this is the first of many more targeted proposals, and as such bears some of the greatest resemblance to what we already have, but it does differ in subtle ways at the API level (addressing issues raised in the past at the SDK level) and targets a narrower set of valid providers. It makes sense to me that the existing state store be rebranded as a general purpose store that's still more than valid for broad generalized state management support, but that this block be positioned as the first of many specialized and more specifically built solutions for common scenarios. + +Here, we're seeking to target an ever-so-slightly diferent piece of the key/value store pie in that we're targeting those developer who expect to have small chunks of data expire over time. They know the keys they're intending to engage with they have specific ideas around the expiration strategies they want to exploint, meaning that this block will not support the [list operations](https://github.com/dapr/proposals/pull/61) offered in a separate proposal. + +Caches are often designed for rapid retrieval and as such are often hosted in-memory, meaning that developers aren't expecting to store large values alongside their keys. As such, this proposal doesn't support streaming operations but will instead set and return whole values and the documentation should call out recommendations as to the maximum size of values being stored. + +Similarly, there's no need for ETag support or other consistency operations. While that might make sense in the context of another key/value store scenario or a document store (e.g. scenarios where the values are potentially quite large), it simply doesn't fit with the narrow API envisioned for this block. Transactions will not be supported as they aren't common to cache implementations as a whole and as such, this will not be a suitable foundation for actors or workflows to build atop of. + +So what's left? Again, a narrowly defined state store that is compatible with the Dapr resiliency capabilities, but which offers the ability to get and set values for known keys supporting both sliding and absolute expiration, and precious little more. + +## Component YAML +This component is expected to have similar attributes to existing state stores with some variation: + +```yaml +apiVersion: dapr.io/v1alpha1 +kind: Component +metadata: + name: cache +spec: + type: centralizedcache. + version: v1 + metadata: + # Various properties necessary for component registration +``` + +## Implementation Overview +There are a few guiding principles I've stuck to while thinking through the shape of this proposal (iterated on in [this issue](https://github.com/dapr/dapr/issues/7886) to date): +- All SDK operations should be implemented asynchronously in a way that minimize operations with the sidecar. +- The key should be represented as a `string`, but the value should be persisted as a `byte[]` as it's not going to be separately queried and would be retrieved in full. This also leaves as an exercise to the developer or SDK precisely how serialization is handled (e.g. serialization strategy, formatting, encryption, compression, encoding, etc.). Ideally, ths various SDK maintainers can develop a uniform approach to this (a POC exists in the .NET SDK [here](https://github.com/dapr/dotnet-sdk/pull/1378)) so that data encoded in C# is easily decoded from any other language SDKs. +- No additional metadata should be stored alongside the value (e.g. minimize multiple operations), but should instead be bundled with the payload if necessary. This store is simply an abstraction for storing keys with values and facilitating storage and retrieval - nothing more.. from the SDK side. + +That said, from an implementation perspective, I leave it as an open question in this proposal whether we want to also persist the sliding timespan in an adjacent key (e.g. `-sliding`) so accommodate sliding expirations where it might not be implemented in the underlying provider. Further, while my overarching goal is to minimize having functionality implemented by Dapr and instead heavily rely on the native providers capabilities alone, the distributed scheduler could provide this sliding and absolute TTL support where underlying providers don't have such support, broadening the number of providers that could be used with our cache abstraction, so there's some minimal wiggle room on that. That said, such variations should be minimal and undertaken only where the support coincides with the goals of this block and not just to add more functionality just because, as that doesn't align with the purpose-driven philosophy. + +### Expiration Strategy +The expiry should be expressed in [ISO 8601 duration format](https://en.wikipedia.org/wiki/ISO_8601#Durations) and will reflect both/either a sliding and/or an absolute expiration as desired. A sliding expiration puts a bounds on the inactivity of any given state before it's removed, but resets this expiration value upon access (e.g. a value is set at noon with a sliding expiration of 2 hours meaning it expires at 2:00 PM. At 1:00 PM, the value is accessed, meaning that the sliding expiration is reset once again to 2 hours later, expiring now at 3:00 PM). An absolute expiration indicates the optional maximum period a value should be cached and overrides any sliding expiration value, when set. For example here, if the absolute expiration is specified at an hour, the value will be removed and not returned even if the sliding expiration would not have yet elapsed. + +### gRPC APIs +In the Dapr gRPC APIs, we'd extend the `runtime.v1.Dapr` service to add new methods: + +| Note: APIs will have Alpha1 suffixed to the type names while in preview + +| Note: Any authentication behaviors are maintained in the component YAML configuration + +```proto +//Existing Dapr service +service Dapr { + // Sets or updates a cache entry + rpc SetCentralizedCacheAlpha1(SetCentralizedCacheRequest) returns (SetCentralizedCacheResponse) {} + + // Retrieves a previously set and unexpired cache entry + rpc GetCentralizedCacheAlpha1(GetCentralizedCacheRequest) returns (GetCentralizedCacheResponse) {} + + // Refreshes the sliding expiration for an existing cache entry + rpc RefreshCentralizedCacheAlpha1(RefreshCentralizedCacheRequest) returns (RefreshCentralizedCacheResponse) {} + + // Removes an existing cache entry + rpc RemoveCentralizedCacheAlpha1(RemoveCentralizedCacheRequest) returns (RemoveCentralizedCacheResponse) {} +} + +// Sets a cache entry in the store +mesage SetCentralizedCacheRequest { + // The value to persist + google.protobuf.Any value = 1; + // Presented in ISO 8601 duration format + string slidingExpiration = 2; + // Presented in ISO 8601 duration format + optional string absoluteExpiration = 3; + // Optional: Indicates whether an existing key and expiry information should be overwritten (true) or not (false) with the new value and expiry information. Defaults to true. + optional boolean overwrite = 4; +} + +// The expected response from an attempt to set a cache entry in the store +message SetCentralizedCacheResponse { + // Indicates whether a key already exists - if `overwrite` was false on the request, this tells the caller that the new value was not persisted. If `overwrite` was true, this just tells the caller an overwrite occured. + boolean exists; +} + +// The payload used to retrieve a stored and unexpired cahe entry +message GetCentralizedCacheRequest { + // The key being accessed + string key; +} + +// Represents the returned value and expiration properties for a given cache entry +message CentralizedCacheValue { + // Contains the value of the specified key. + byte[] value = 1; + // ISO 8601 duration format, updated to reflect the latest value post-touch + string slidingExpiration = 2; + // ISO 8601 duration format + optional string absoluteExpiration = 3; +} + +// Response from an attempt to retrieve a stored and unexpired value +message GetCentralizedCacheResponse { + // Nullable: If null, the value does not exist or is expired. Otherwise, contains the latest entry value and updated expiration values. + CentralizedCacheValue data +} + +// Used to refresh the sliding expiration value for a cache entry +message RefreshCentralizedCacheRequest { + // The key to refresh the sliding expiration for + string key +} + +// Represents the returned expiration properties for a given cache entry, absent the entry value +message RefreshCentralizedCacheValue { + // ISO 8601 duration format, updated to reflect the latest value post-touch + string slidingExpiration = 2; + // ISO 8601 duration format + optional string absoluteExpiration = 3; +} + +// Reflects whether a given key existed to be refreshed or not. +message RefreshCentralizedCacheResponse { + // Nullable: If null, the value doesn't exist or is already expired. Otherwise, contains the updated expiration values for the cache entry. + RefreshCentralizedCacheValue data; +} + +// Used to remove the value for a specified key +message RemoveCentralizedCacheRequest { + // The key to remove + string key +} +``` + +### HTTP APIs +The HTTP APIs align with the shapes of the gRPC APIs. + +| Note: The following URLs assume that `state` is actually used as a root path for the URL so this and other specialized stores continue to be regarded under the state umbrella. + +#### Add a value to the specified key +POST `http://localhost:{daprPort}/v1.0-alpha1/state/centralizedcache/{key} +Payload: +```json +{ + 'overwrite': boolean, + 'value': byte[], + 'slidingExpiration': string, + 'absoluteExpiration': string //Optional +} +``` +Response: +```json +{ + 'exists': boolean // True if there was already an existing entry before the operation, false if there is no existing entry +} +``` + +#### Try to retrieve the value associated with the given key +GET `http://localhost:{daprPort}/v1.0-alpha1/state/centralizedcache/{key} +Response: +```json +{ + 'data': { + 'value': byte[], + 'absoluteExpiration': string, + 'slidingExpiration': string + } | nullable +} +``` + +#### Refresh the cache entry +HEAD `http://localhost:{daprPort}/v1.0-alpha1/state/centralizedcache/{key} +```json +{ + 'data': { + 'absoluteExpiration': string, + 'slidingExpiration': string + } | nullable +} +``` + +#### Remove cache entry +DELETE `http://localhost:{daprPort}/v1.0-alpha1/state/centralizedcache/{key}` + +| Note URLs will reflect the `/v1.0-alpha1/` version while in preview. + +## SDK Example - .NET +The following reflects what the API would look like in the .NET SDK. + +| Method Signature | Description | +| -- | -- | +| Task TryAddAsync(string key, ReadOnlyMemory value, CentralizedCacheOptions options, CancellationToken cancellationToken) | Adds the specified key and value to the store along with a sliding expiration represented as a `TimeSpan` (relative to now) and an optional absolute expiration also represented as a `TimeSpan` (relative to now) or a `DateTimeOffset` (absolute and converted by the SDK to a relative duration). | +| Task AddAsync(string key, ReadOnlyMemory value, CentralizedCacheOptions options, CancellationToken cancellationToken) | Adds the specified key and value to the store along with the expiration options provided in `TryAddAsync` and overwrites any existing information. | +| Task TryGetAsync(string key, out CacheEntry value, CancellationToken cancellationToken) | Attempts to retrieve the value for the specified key. If the key is available, returns true and populates the `out` value with the value and expiration properties. If the key is not available, returns false. | +| Task TryRefreshAsync(string key, out CacheProperties valuecancellationToken, CancellationToken cancellationToken) | Returns true if an unexpired key is found and false if the key is not found. If found, returns the expiration values as the `out` value. | +| Task TryRemoveAsync(string key, CancellationToken cancellationToken) | Attempts to remove the entry with the specified key from the store. | + +Each of these methods supports a cancellation token because while it's not expected that the methods would take any substantial amount of time to complete (as that would be antithetical to the purpose of these component), it facilitates future-proofing in case the Dapr runtime supports cancellation tokens in the future, it makes the methods consistent with the expected shape of other C# async methods, it facilitates testing to enable users to simulate cancellation scenarios themselves and it allows for the operation to be canceled by the client in case of an unexpected timeout with call to the Dapr runtime. + +## Implications +- Outside of augmenting documentation accordingly, no immediate implications of implementing this updated state store mechanism. In the longer run, as this and other state store blocks are added and mature, it might be worth revsiting completely deprecating the existing general-purpose state store, but that's well beyond the current horizon. + +## Completion Checklist +[] Centralized cache API code +[] Tests added (e2e, unit) +[] SDK changes +[] Documentation \ No newline at end of file From 979fd06a06ddb0d2f8376a9af7a000c0b8d54d58 Mon Sep 17 00:00:00 2001 From: Whit Waldo Date: Sun, 5 Jan 2025 18:00:07 -0600 Subject: [PATCH 2/6] Replaced byte[] with google.protobuf.Any in proto docs Signed-off-by: Whit Waldo --- 20250105-BCRS-centralized-cache.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/20250105-BCRS-centralized-cache.md b/20250105-BCRS-centralized-cache.md index d19cb8f..98d5dff 100644 --- a/20250105-BCRS-centralized-cache.md +++ b/20250105-BCRS-centralized-cache.md @@ -100,7 +100,7 @@ message GetCentralizedCacheRequest { // Represents the returned value and expiration properties for a given cache entry message CentralizedCacheValue { // Contains the value of the specified key. - byte[] value = 1; + google.protobuf.Any value = 1; // ISO 8601 duration format, updated to reflect the latest value post-touch string slidingExpiration = 2; // ISO 8601 duration format From f98f53828dc9d8566d8876fb239a1057af2e878d Mon Sep 17 00:00:00 2001 From: Whit Waldo Date: Sun, 5 Jan 2025 18:08:01 -0600 Subject: [PATCH 3/6] Added more to the completion checklist Signed-off-by: Whit Waldo --- 20250105-BCRS-centralized-cache.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/20250105-BCRS-centralized-cache.md b/20250105-BCRS-centralized-cache.md index 98d5dff..61fab8b 100644 --- a/20250105-BCRS-centralized-cache.md +++ b/20250105-BCRS-centralized-cache.md @@ -203,6 +203,8 @@ The following reflects what the API would look like in the .NET SDK. | Task TryRefreshAsync(string key, out CacheProperties valuecancellationToken, CancellationToken cancellationToken) | Returns true if an unexpired key is found and false if the key is not found. If found, returns the expiration values as the `out` value. | | Task TryRemoveAsync(string key, CancellationToken cancellationToken) | Attempts to remove the entry with the specified key from the store. | +The `CentralizedCacheOptions` + Each of these methods supports a cancellation token because while it's not expected that the methods would take any substantial amount of time to complete (as that would be antithetical to the purpose of these component), it facilitates future-proofing in case the Dapr runtime supports cancellation tokens in the future, it makes the methods consistent with the expected shape of other C# async methods, it facilitates testing to enable users to simulate cancellation scenarios themselves and it allows for the operation to be canceled by the client in case of an unexpected timeout with call to the Dapr runtime. ## Implications @@ -212,4 +214,5 @@ Each of these methods supports a cancellation token because while it's not expec [] Centralized cache API code [] Tests added (e2e, unit) [] SDK changes -[] Documentation \ No newline at end of file +[] Identify which providers support this component +[] Documentation (supported providers, purpose, set up structure for future specialty state stores) \ No newline at end of file From df8db441dc5255d31c80f218b9d7a50b3b52dc8c Mon Sep 17 00:00:00 2001 From: Whit Waldo Date: Sun, 5 Jan 2025 18:32:16 -0600 Subject: [PATCH 4/6] Added more formatting and type explanation Signed-off-by: Whit Waldo --- 20250105-BCRS-centralized-cache.md | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/20250105-BCRS-centralized-cache.md b/20250105-BCRS-centralized-cache.md index 61fab8b..f31fda6 100644 --- a/20250105-BCRS-centralized-cache.md +++ b/20250105-BCRS-centralized-cache.md @@ -197,13 +197,17 @@ The following reflects what the API would look like in the .NET SDK. | Method Signature | Description | | -- | -- | -| Task TryAddAsync(string key, ReadOnlyMemory value, CentralizedCacheOptions options, CancellationToken cancellationToken) | Adds the specified key and value to the store along with a sliding expiration represented as a `TimeSpan` (relative to now) and an optional absolute expiration also represented as a `TimeSpan` (relative to now) or a `DateTimeOffset` (absolute and converted by the SDK to a relative duration). | -| Task AddAsync(string key, ReadOnlyMemory value, CentralizedCacheOptions options, CancellationToken cancellationToken) | Adds the specified key and value to the store along with the expiration options provided in `TryAddAsync` and overwrites any existing information. | -| Task TryGetAsync(string key, out CacheEntry value, CancellationToken cancellationToken) | Attempts to retrieve the value for the specified key. If the key is available, returns true and populates the `out` value with the value and expiration properties. If the key is not available, returns false. | -| Task TryRefreshAsync(string key, out CacheProperties valuecancellationToken, CancellationToken cancellationToken) | Returns true if an unexpired key is found and false if the key is not found. If found, returns the expiration values as the `out` value. | -| Task TryRemoveAsync(string key, CancellationToken cancellationToken) | Attempts to remove the entry with the specified key from the store. | +| `Task` TryAddAsync(`string` key, `ReadOnlyMemory` value, `CentralizedCacheOptions` options, `CancellationToken` cancellationToken) | Adds the specified key and value to the store along with a sliding expiration represented as a `TimeSpan` (relative to now) and an optional absolute expiration also represented as a `TimeSpan` (relative to now) or a `DateTimeOffset` (absolute and converted by the SDK to a relative duration). | +| `Task` AddAsync(`string` key, `ReadOnlyMemory` value, `CentralizedCacheOptions` options, `CancellationToken` cancellationToken) | Adds the specified key and value to the store along with the expiration options provided in `TryAddAsync` and overwrites any existing information. | +| `Task` TryGetAsync(`string` key, out `CacheEntry` value, `CancellationToken` cancellationToken) | Attempts to retrieve the value for the specified key. If the key is available, returns true and populates the `out` value with the value and expiration properties. If the key is not available, returns false. | +| `Task` TryRefreshAsync(`string` key, out `CacheProperties` value, cancellationToken, `CancellationToken` cancellationToken) | Returns true if an unexpired key is found and false if the key is not found. If found, returns the expiration values as the `out` value. | +| `Task` TryRemoveAsync(string key, CancellationToken cancellationToken) | Attempts to remove the entry with the specified key from the store. | -The `CentralizedCacheOptions` +The `CentralizedCacheOptions` contains the required sliding expiration `TimeSpan` as well as an optional absolute expiration `TimeSpan`. + +A `CacheEntry` contains the value of the cache entry, as well as the current expiration values, as applicable, for both the sliding and optional absolute expirations. Either one should be presented as a `TimeSpan` and the SDK can handle exposing either as an absolute `DateTime` offset to UTC as a read-only property. + +A `CacheProperties` caontains current expiration values, as applicable, for both the sliding and optional absolute expirations. Either one should be presented as a `TimeSpan` and the SDK can handle exposing either as an absolute `DateTime` offset to UTC as a read-only property. In other words, it returns the same properties as a `CacheEntry`, but without the value. Each of these methods supports a cancellation token because while it's not expected that the methods would take any substantial amount of time to complete (as that would be antithetical to the purpose of these component), it facilitates future-proofing in case the Dapr runtime supports cancellation tokens in the future, it makes the methods consistent with the expected shape of other C# async methods, it facilitates testing to enable users to simulate cancellation scenarios themselves and it allows for the operation to be canceled by the client in case of an unexpected timeout with call to the Dapr runtime. From e77e6f02a6ec12670d71a0c1b4b565d9b9dacfda Mon Sep 17 00:00:00 2001 From: Whit Waldo Date: Sun, 5 Jan 2025 20:01:38 -0600 Subject: [PATCH 5/6] Added open question about a Dapr-managed TTL via Jobs API Signed-off-by: Whit Waldo --- 20250105-BCRS-centralized-cache.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/20250105-BCRS-centralized-cache.md b/20250105-BCRS-centralized-cache.md index f31fda6..25ef3eb 100644 --- a/20250105-BCRS-centralized-cache.md +++ b/20250105-BCRS-centralized-cache.md @@ -214,6 +214,10 @@ Each of these methods supports a cancellation token because while it's not expec ## Implications - Outside of augmenting documentation accordingly, no immediate implications of implementing this updated state store mechanism. In the longer run, as this and other state store blocks are added and mature, it might be worth revsiting completely deprecating the existing general-purpose state store, but that's well beyond the current horizon. +## Open Questions +### Dapr-managed TTL +There are several of the state proposals (including the [object store proposal here](https://github.com/dapr/proposals/pull/18)) that could benefit from having TTL support. While both Azure Blob Store and Amazon S3 offer TTL support via lifecycle policies, I wonder to what extent it could make sense to have one uniform internal mechanism that provides TTL on top of the new scheduler API and whether that could broaden the number of providers available both to this cache proposal, but also that separate object store proposal in that TTL could be implemented as a future job that, when triggered, invokes a deletion operation. Or as in this proposal, during a refresh, simply updates the job details to trigger at a later point. This is presumably done by actors (and in turn by workflows) today, so this might be an aspect of this and other building blocks that Dapr handles instead of offloading to the providers (and thus making the Dapr-implemented API more complex for each provider - e.g. adding lifecycle management support just for TTL needs on both Azure and Amazon's clouds). + ## Completion Checklist [] Centralized cache API code [] Tests added (e2e, unit) From d623f0a228aa99b547bdf0c864de1c5f42510b05 Mon Sep 17 00:00:00 2001 From: Whit Waldo Date: Mon, 6 Jan 2025 00:40:19 -0600 Subject: [PATCH 6/6] Fixed spelling typos, added an argument about maintenance timing Signed-off-by: Whit Waldo --- 20250105-BCRS-centralized-cache.md | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/20250105-BCRS-centralized-cache.md b/20250105-BCRS-centralized-cache.md index 25ef3eb..6042ac9 100644 --- a/20250105-BCRS-centralized-cache.md +++ b/20250105-BCRS-centralized-cache.md @@ -5,7 +5,7 @@ ## Overview This is the first of [many proposals](https://github.com/dapr/dapr/issues/7339) to overhaul the state building block. Rather than continue to iterate on the existing state building block and potentially disrupt existing users of those capabilities, I propose a clean break towards purpose-specific second-generation State Management building blocks in that the current generation can continue to be maintained for the foreseeable future, but that new work is focused on these newer blocks that seek less to adopt a kitchen-sink approach and instead are narrowly tailored for their use cases and purposes. -This proposal focuses on the implementation of a centralized cache store. While both this proposal and the existing state API are both designed around key/value stores, the proposed shape of this API different from the existing state store in subtle, but helpful ways and makes for some low-hanging fruit to get the ball rolling on these replacement opportunities. As it's a clean break from past decisions, it has the benefit of being a fresh start that favors a concise, simple API that addresses cache-specific functionality and nothing more. Should a developer want something more from the API such as an ability to list existing keys or query the values with some constraint, they're encouraged to turn to other specialized stores for that functionality such as my proposed rethink of a [Key/Value Store](https://github.com/dapr/dapr/issues/7338) or the propsoed [Document Store](https://github.com/dapr/dapr/issues/5146), respectively. +This proposal focuses on the implementation of a centralized cache store. While both this proposal and the existing state API are both designed around key/value stores, the proposed shape of this API different from the existing state store in subtle, but helpful ways and makes for some low-hanging fruit to get the ball rolling on these replacement opportunities. As it's a clean break from past decisions, it has the benefit of being a fresh start that favors a concise, simple API that addresses cache-specific functionality and nothing more. Should a developer want something more from the API such as an ability to list existing keys or query the values with some constraint, they're encouraged to turn to other specialized stores for that functionality such as my proposed rethink of a [Key/Value Store](https://github.com/dapr/dapr/issues/7338) or the proposed [Document Store](https://github.com/dapr/dapr/issues/5146), respectively. ## What's the purpose of a cache? As this is a purpose-built implementation, I wanted to take a moment to specifically call out the problem this block seeks to provide a solution for. @@ -16,7 +16,7 @@ A cache should temporarily store some data (typically something not too large) t Again, this is the first of many more targeted proposals, and as such bears some of the greatest resemblance to what we already have, but it does differ in subtle ways at the API level (addressing issues raised in the past at the SDK level) and targets a narrower set of valid providers. It makes sense to me that the existing state store be rebranded as a general purpose store that's still more than valid for broad generalized state management support, but that this block be positioned as the first of many specialized and more specifically built solutions for common scenarios. -Here, we're seeking to target an ever-so-slightly diferent piece of the key/value store pie in that we're targeting those developer who expect to have small chunks of data expire over time. They know the keys they're intending to engage with they have specific ideas around the expiration strategies they want to exploint, meaning that this block will not support the [list operations](https://github.com/dapr/proposals/pull/61) offered in a separate proposal. +Here, we're seeking to target an ever-so-slightly different piece of the key/value store pie in that we're targeting those developers who expect to have small chunks of data expire over time. They know the keys they're intending to engage with they have specific ideas around the expiration strategies they want to exploit, meaning that this block will not support the [list operations](https://github.com/dapr/proposals/pull/61) offered in a separate proposal. Caches are often designed for rapid retrieval and as such are often hosted in-memory, meaning that developers aren't expecting to store large values alongside their keys. As such, this proposal doesn't support streaming operations but will instead set and return whole values and the documentation should call out recommendations as to the maximum size of values being stored. @@ -24,6 +24,9 @@ Similarly, there's no need for ETag support or other consistency operations. Whi So what's left? Again, a narrowly defined state store that is compatible with the Dapr resiliency capabilities, but which offers the ability to get and set values for known keys supporting both sliding and absolute expiration, and precious little more. +## Could it be argued that this isn't dramatically different enough from today's state store to justify the maintenance today? +Sure. But my counter argument would be that it would have to be accommodated at some point and would be a dramatically lower maintenance burden than the current state store. Especially as the number of these specialty stores grows over time, I expect the absence of this relatively simple variation would increasingly grow inconspicuous. And besides, it makes for a solid and well-understood introduction into the purpose-focused arena for the next generation of state stores. Again, there can be an emphasis that if the current state store is working for you, there's no need to mix things up just because, but that if you need a version that supports both a sliding and absolute expiration (TTL) for small values, this specialized use-case is for you. + ## Component YAML This component is expected to have similar attributes to existing state stores with some variation: @@ -33,7 +36,7 @@ kind: Component metadata: name: cache spec: - type: centralizedcache. + type: state.centralizedcache. version: v1 metadata: # Various properties necessary for component registration @@ -41,11 +44,11 @@ spec: ## Implementation Overview There are a few guiding principles I've stuck to while thinking through the shape of this proposal (iterated on in [this issue](https://github.com/dapr/dapr/issues/7886) to date): -- All SDK operations should be implemented asynchronously in a way that minimize operations with the sidecar. -- The key should be represented as a `string`, but the value should be persisted as a `byte[]` as it's not going to be separately queried and would be retrieved in full. This also leaves as an exercise to the developer or SDK precisely how serialization is handled (e.g. serialization strategy, formatting, encryption, compression, encoding, etc.). Ideally, ths various SDK maintainers can develop a uniform approach to this (a POC exists in the .NET SDK [here](https://github.com/dapr/dotnet-sdk/pull/1378)) so that data encoded in C# is easily decoded from any other language SDKs. +- All SDK operations should be implemented asynchronously in a way that minimizes operations with the sidecar. +- The key should be represented as a `string`, but the value should be persisted as a `byte[]` as it's not going to be separately queried and would be retrieved in full. This also leaves as an exercise to the developer or SDK precisely how serialization is handled (e.g. serialization strategy, formatting, encryption, compression, encoding, etc.). Ideally, the various SDK maintainers can develop a uniform approach to this (a POC exists in the .NET SDK [here](https://github.com/dapr/dotnet-sdk/pull/1378)) so that data encoded in C# is easily decoded from any other language SDKs. - No additional metadata should be stored alongside the value (e.g. minimize multiple operations), but should instead be bundled with the payload if necessary. This store is simply an abstraction for storing keys with values and facilitating storage and retrieval - nothing more.. from the SDK side. -That said, from an implementation perspective, I leave it as an open question in this proposal whether we want to also persist the sliding timespan in an adjacent key (e.g. `-sliding`) so accommodate sliding expirations where it might not be implemented in the underlying provider. Further, while my overarching goal is to minimize having functionality implemented by Dapr and instead heavily rely on the native providers capabilities alone, the distributed scheduler could provide this sliding and absolute TTL support where underlying providers don't have such support, broadening the number of providers that could be used with our cache abstraction, so there's some minimal wiggle room on that. That said, such variations should be minimal and undertaken only where the support coincides with the goals of this block and not just to add more functionality just because, as that doesn't align with the purpose-driven philosophy. +That said, from an implementation perspective, I leave it as an open question in this proposal whether we want to also persist the sliding timespan in an adjacent key (e.g. `-sliding`) to accommodate sliding expirations where it might not be implemented in the underlying provider. Further, while my overarching goal is to minimize having functionality implemented by Dapr and instead heavily rely on the native providers capabilities alone, the distributed scheduler could provide this sliding and absolute TTL support where underlying providers don't have such support, broadening the number of providers that could be used with our cache abstraction, so there's some minimal wiggle room on that. That said, such variations should be minimal and undertaken only where the support coincides with the goals of this block and not just to add more functionality just because, as that doesn't align with the purpose-driven philosophy. ### Expiration Strategy The expiry should be expressed in [ISO 8601 duration format](https://en.wikipedia.org/wiki/ISO_8601#Durations) and will reflect both/either a sliding and/or an absolute expiration as desired. A sliding expiration puts a bounds on the inactivity of any given state before it's removed, but resets this expiration value upon access (e.g. a value is set at noon with a sliding expiration of 2 hours meaning it expires at 2:00 PM. At 1:00 PM, the value is accessed, meaning that the sliding expiration is reset once again to 2 hours later, expiring now at 3:00 PM). An absolute expiration indicates the optional maximum period a value should be cached and overrides any sliding expiration value, when set. For example here, if the absolute expiration is specified at an hour, the value will be removed and not returned even if the sliding expiration would not have yet elapsed.