diff --git a/.gitignore b/.gitignore index 4313a01e..99a53648 100644 --- a/.gitignore +++ b/.gitignore @@ -22,6 +22,8 @@ examples/python-processor/build/** examples/python-processor/dependencies/** python/functionstream-client/src/fs_client/_proto/ python/functionstream-api/build +python/functionstream-api-advanced/build + # python Runtime - Build artifacts and intermediate files diff --git a/Makefile b/Makefile index 9472c787..4daf185b 100644 --- a/Makefile +++ b/Makefile @@ -82,7 +82,7 @@ build-lite: .check-env .build-wasm: $(call log,WASM,Building Python Runtime using $(PYTHON_EXEC)) @cd $(PYTHON_ROOT)/functionstream-runtime && \ - PYTHONPATH=../functionstream-api ../../$(PYTHON_EXEC) build.py > /dev/null + PYTHONPATH=../functionstream-api:../functionstream-api-advanced ../../$(PYTHON_EXEC) build.py > /dev/null @[ -f "$(WASM_SOURCE)" ] || (printf "$(C_R)[X] WASM Build Failed$(C_0)\n" && exit 1) dist: build diff --git a/README-zh.md b/README-zh.md index 2f444484..b1d68eac 100644 --- a/README-zh.md +++ b/README-zh.md @@ -206,7 +206,8 @@ function-stream-/ | [Function 任务配置规范](docs/function-configuration-zh.md) | 任务定义规范 | | [SQL CLI 交互式管理指南](docs/sql-cli-guide-zh.md) | 交互式管理指南 | | [Function 管理与开发指南](docs/function-development-zh.md) | 管理与开发指南 | -| [Python SDK 开发与交互指南](docs/python-sdk-guide-zh.md) | Python SDK 指南 | +| [Go SDK 开发与交互指南](docs/Go-SDK/go-sdk-guide-zh.md) | Go SDK 指南 | +| [Python SDK 开发与交互指南](docs/Python-SDK/python-sdk-guide-zh.md) | Python SDK 指南 | ## 配置 diff --git a/README.md b/README.md index ed08e9fb..51a69de1 100644 --- a/README.md +++ b/README.md @@ -205,7 +205,8 @@ We provide a robust shell script to manage the server process, capable of handli | [Function Configuration](docs/function-configuration.md) | Task Definition Specification | | [SQL CLI Guide](docs/sql-cli-guide.md) | Interactive Management Guide | | [Function Development](docs/function-development.md) | Management & Development Guide | -| [Python SDK Guide](docs/python-sdk-guide.md) | 
Python SDK Guide | +| [Go SDK Guide](docs/Go-SDK/go-sdk-guide.md) | Go SDK Guide | +| [Python SDK Guide](docs/Python-SDK/python-sdk-guide.md) | Python SDK Guide | ## Configuration diff --git a/docs/Go-SDK/go-sdk-advanced-state-api-zh.md b/docs/Go-SDK/go-sdk-advanced-state-api-zh.md new file mode 100644 index 00000000..9abfff92 --- /dev/null +++ b/docs/Go-SDK/go-sdk-advanced-state-api-zh.md @@ -0,0 +1,321 @@ + + +# Go SDK — 高级状态 API + +本文档介绍 Function Stream Go SDK 的**带类型高级状态 API**:在底层 `Store` 之上提供的状态抽象(ValueState、ListState、MapState、PriorityQueueState、AggregatingState、ReducingState 及 Keyed* 工厂),通过 **codec** 序列化,并支持按主键的 **keyed state**。当需要结构化状态而不想手写字节编码或 key 布局时使用。 + +**文档结构:** + +- [何时使用哪种 API](#1-何时使用哪种-api) — 按使用场景选择状态类型。 +- [包与导入](#2-包与导入) — 类型与 codec 所在包。 +- [Codec 约定](#3-codec-约定与默认-codec) — 编码、解码及有序性要求。 +- [创建状态](#4-创建状态带-codec-与-autocodec) — 显式 codec 与 AutoCodec 构造方式。 +- [非 Keyed 状态参考](#5-非-keyed-状态structures) — 方法与构造方法一览。 +- [AggregateFunc 与 ReduceFunc](#6-aggregatefunc-与-reducefunc) — 聚合与归约接口。 +- [Keyed 状态](#7-keyed-状态工厂与按-key-实例) — keyGroup、primaryKey、namespace 与工厂方法。 +- [错误处理与最佳实践](#8-错误处理与最佳实践) — 生产环境建议。 +- [示例](#9-示例) — ValueState、Keyed list、MapState、AggregatingState。 + +--- + +## 1. 
何时使用哪种 API + +高级状态 API 对单个逻辑 store 提供带类型视图。根据访问模式选择对应抽象: + +| 使用场景 | 推荐 API | 说明 | +|-------------------------|-------------------------------|----------------------------------------------| +| 单一逻辑值(计数、配置块、最新值) | **ValueState[T]** | 每个 store 一个值;更新即覆盖。 | +| 仅追加序列(事件日志、历史) | **ListState[T]** | 批量添加、整体读/替换;无 key 迭代。 | +| 需范围/迭代的键值映射 | **MapState[K,V]** | 键类型**必须**有有序 codec(如基本类型)。 | +| 优先队列(最小/最大、Top-K) | **PriorityQueueState[T]** | 元素类型**必须**有有序 codec。 | +| 运行中聚合(sum、count、自定义累加器) | **AggregatingState[T,ACC,R]** | 使用 `AggregateFunc`;累加器可合并。 | +| 运行中归约(二元合并) | **ReducingState[V]** | 使用 `ReduceFunc`;满足结合律的合并。 | +| **按 key** 的状态(按用户、按分区) | **Keyed*** 工厂 | 用于 **keyed 算子**;工厂 + 每条记录的 primaryKey。 | +| 自定义 key 布局、批量扫描、非类型化存储 | 底层 **Store** | `Put`/`Get`/`ScanComplex`/`ComplexKey`;完全自控。 | + +**Keyed 与非 Keyed** + +- **Keyed 状态**用于 **keyed 算子**:流按 key 分区(如 keyBy 之后)。运行时按 key 投递记录;每个 key 应有独立状态。可**一次**获取**工厂**(从 context、store 名称、keyGroup),再按**主键**(流 key)与 namespace 构造对应状态类型。 +- **非 Keyed 状态**(ValueState、ListState 等)每个 store 存一个逻辑实体。在无 key 分区或维护单一全局状态时使用。 + +--- + +## 2. 包与导入 + +**两个独立库:** 低阶 **go-sdk**,高阶 **go-sdk-advanced**(依赖 go-sdk)。 + +| 包 | 导入路径 | 职责 | +|------------------------|---------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------| +| **codec**(高阶) | `github.com/functionstream/function-stream/go-sdk-advanced/codec` | `Codec[T]` 接口及内置 codec。 | +| **structures**(高阶) | `github.com/functionstream/function-stream/go-sdk-advanced/structures` | ValueState、ListState、MapState、PriorityQueueState、AggregatingState、ReducingState。 | +| **keyed**(高阶) | `github.com/functionstream/function-stream/go-sdk-advanced/keyed` | Keyed 状态工厂及按 key 的类型(KeyedListStateFactory、KeyedListState 等)。在 keyed 算子中使用。 | + +所有状态构造方法均接收 `api.Context`(即 `fssdk.Context`)和 **store 名称**。Store 内部通过 `ctx.GetOrCreateStore(storeName)` 获取。同一 store 名称始终对应同一底层 store(默认实现为 RocksDB)。 + +--- + +## 3. 
Codec 约定与默认 Codec + +### 3.1 Codec 接口 + +`codec.Codec[T]`: + +| 方法 | 说明 | +|-----------------------------------|-------------------------------------------------------------------------------------| +| `Encode(value T) ([]byte, error)` | 将值序列化为字节。 | +| `Decode(data []byte) (T, error)` | 从字节反序列化。 | +| `EncodedSize() int` | 固定大小时返回 `> 0`;变长时为 `<= 0`(用于 list 优化)。 | +| `IsOrderedKeyCodec() bool` | 为 `true` 时,字节编码**全序**:字节字典序与值的顺序一致。**MapState 的 key 与 PriorityQueueState 的元素必须满足**。 | + +### 3.2 DefaultCodecFor[T]() + +`codec.DefaultCodecFor[T]()` 返回类型 `T` 的默认 codec: + +- **基本类型**(`int32`、`int64`、`uint32`、`uint64`、`float32`、`float64`、`string`、`bool`、`int`、`uint` 等):内置 codec;作为 map key 或 PQ 元素使用时**有序**。 +- **结构体、map、slice、数组**:`JSONCodec[T]` — JSON 编码;**无序**(`IsOrderedKeyCodec() == false`)。**不要**在 MapState key 或 PriorityQueueState 元素类型上使用 AutoCodec(依赖有序性的操作可能失败或 panic)。 +- **无约束的接口类型**:返回错误;类型参数须为具体类型。 + +### 3.3 有序性要求 + +**MapState[K,V]** 与 **PriorityQueueState[T]** 的 key(或元素)类型必须使用 `IsOrderedKeyCodec() == true` 的 codec。对 Map 或 PQ 使用 **AutoCodec** 构造时,请使用基本类型的 key/元素(如 `int64`、`string`),或提供显式有序 codec。 + +--- + +## 4. 创建状态:带 Codec 与 AutoCodec + +两种构造方式: + +1. **显式 codec** — `NewXxxFromContext(ctx, storeName, codec, ...)` + 由你提供 `Codec[T]`(Map 需 key + value codec;Aggregating/Reducing 需 acc/value codec 及函数)。可完全控制编码与有序性。 + +2. **AutoCodec** — `NewXxxFromContextAutoCodec(ctx, storeName)` 或 `(ctx, storeName, aggFunc/reduceFunc)` + SDK 使用 `codec.DefaultCodecFor[T]()` 作为 value/累加器类型。Map 与 PQ 的 key/元素类型须有**有序**默认 codec;否则创建或操作可能返回 `ErrStoreInternal`。 + +状态实例是**轻量**的。可在每次调用(如 `Process` 内)创建,或在 Driver 中(如 `Init`)缓存。同一 store 名称始终对应同一底层 store;仅类型视图不同。 + +--- + +## 5. 非 Keyed 状态(structures) + +### 5.1 语义与方法 + +| 状态 | 语义 | 主要方法 | 有序 codec? 
| +|-------------------------------|---------------------|------------------------------------------------------------------------------------------------------------------------|-----------------| +| **ValueState[T]** | 单一可替换值。 | `Update(value T) error`;`Value() (T, bool, error)`;`Clear() error` | 否 | +| **ListState[T]** | 仅追加列表;批量添加与整体替换。 | `Add(value T) error`;`AddAll(values []T) error`;`Get() ([]T, error)`;`Update(values []T) error`;`Clear() error` | 否 | +| **MapState[K,V]** | 键值映射;通过 `All()` 迭代。 | `Put(key K, value V) error`;`Get(key K) (V, bool, error)`;`Delete(key K) error`;`Clear() error`;`All() iter.Seq2[K,V]` | **Key K:是** | +| **PriorityQueueState[T]** | 优先队列(按编码顺序最小优先)。 | `Add(value T) error`;`Peek() (T, bool, error)`;`Poll() (T, bool, error)`;`Clear() error`;`All() iter.Seq[T]` | **元素 T:是** | +| **AggregatingState[T,ACC,R]** | 可合并累加器的运行中聚合。 | `Add(value T) error`;`Get() (R, bool, error)`;`Clear() error` | 否(ACC codec 任意) | +| **ReducingState[V]** | 二元合并的运行中归约。 | `Add(value V) error`;`Get() (V, bool, error)`;`Clear() error` | 否 | + +### 5.2 构造方法一览(非 Keyed) + +| 状态 | 带 codec | AutoCodec | +|---------------------------|-----------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------| +| ValueState[T] | `NewValueStateFromContext(ctx, storeName, valueCodec)` | `NewValueStateFromContextAutoCodec[T](ctx, storeName)` | +| ListState[T] | `NewListStateFromContext(ctx, storeName, itemCodec)` | `NewListStateFromContextAutoCodec[T](ctx, storeName)` | +| MapState[K,V] | `NewMapStateFromContext(ctx, storeName, keyCodec, valueCodec)` 或 `NewMapStateAutoKeyCodecFromContext(ctx, storeName, valueCodec)` | `NewMapStateFromContextAutoCodec[K,V](ctx, storeName)` | +| PriorityQueueState[T] | `NewPriorityQueueStateFromContext(ctx, storeName, itemCodec)` | `NewPriorityQueueStateFromContextAutoCodec[T](ctx, storeName)` | +| 
AggregatingState[T,ACC,R] | `NewAggregatingStateFromContext(ctx, storeName, accCodec, aggFunc)` | `NewAggregatingStateFromContextAutoCodec(ctx, storeName, aggFunc)` | +| ReducingState[V] | `NewReducingStateFromContext(ctx, storeName, valueCodec, reduceFunc)` | `NewReducingStateFromContextAutoCodec[V](ctx, storeName, reduceFunc)` | + +--- + +## 6. AggregateFunc 与 ReduceFunc + +### 6.1 AggregateFunc[T, ACC, R] + +**AggregatingState** 需要实现 **AggregateFunc[T, ACC, R]**(位于包 `structures`): + +| 方法 | 说明 | +|-------------------------------------|-------------------------------| +| `CreateAccumulator() ACC` | 空状态时的初始累加器。 | +| `Add(value T, accumulator ACC) ACC` | 将一条输入折叠进累加器。 | +| `GetResult(accumulator ACC) R` | 从累加器得到最终结果。 | +| `Merge(a, b ACC) ACC` | 合并两个累加器(如分布式或 checkpoint 合并)。 | + +### 6.2 ReduceFunc[V] + +**ReducingState** 需要 **ReduceFunc[V]**(函数类型):`func(value1, value2 V) (V, error)`。须满足**结合律**(最好满足交换律),使多次应用得到确定的归约结果。 + +--- + +## 7. Keyed 状态 — 工厂与按 Key 实例 + +Keyed 状态用于 **keyed 算子**:流按 key 分区(如 keyBy)时,每个 key 在独立状态上处理。可**一次**获取**工厂**(从 context、store 名称与 **keyGroup**),再按**主键**(当前记录的流 key)与 namespace 构造对应状态类型。 + +状态按 **keyGroup**([]byte)和 **主键**(primaryKey,[]byte)组织。由 context、store 名称、keyGroup 创建工厂;再通过工厂方法按主键获取状态。 + +### 7.1 keyGroup、key(主键)与 namespace + +Keyed API 对应 store 的 **ComplexKey**,有三个维度: + +| 术语 | 出现位置 | 含义 | +|---------------|-----------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------| +| **keyGroup** | 创建工厂时的参数 | **keyed 组**:标识该状态所属分区/组(如 `[]byte("counters")`、`[]byte("sessions")`)。同一 keyed 组 ⇒ 相同 keyGroup 字节。 | +| **key** | 工厂方法中的 `primaryKey`(如 `NewKeyedList(primaryKey, namespace)` 等) | **流 key 的值**:分区流所用的 key,序列化为字节(如用户 ID、分区 key)。不同 primaryKey 对应不同状态。 | +| **namespace** | 工厂方法中的 `namespace`([]byte) | **有窗口时**:**窗口标识的字节**(如序列化的窗口边界或窗口 ID),状态按 key 与窗口隔离。**无窗口时**:传**空字节**(`nil` 或 `[]byte{}`)。 | + +**小结**:**keyGroup** = 
keyed 组标识;**key**(primaryKey)= 流 key 值;**namespace** = 使用窗口时为窗口字节,否则为空。 + +### 7.2 工厂构造方法一览(Keyed) + +| 工厂 | 带 codec | AutoCodec | +|---------------------------------------|---------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------| +| KeyedValueStateFactory[V] | `NewKeyedValueStateFactoryFromContext(ctx, storeName, keyGroup, valueCodec)` | `NewKeyedValueStateFactoryFromContextAutoCodec[V](ctx, storeName, keyGroup)` | +| KeyedListStateFactory[V] | `NewKeyedListStateFactoryFromContext(ctx, storeName, keyGroup, valueCodec)` | `NewKeyedListStateFactoryAutoCodecFromContext[V](ctx, storeName, keyGroup)` | +| KeyedMapStateFactory[MK,MV] | `NewKeyedMapStateFactoryFromContext(ctx, storeName, keyGroup, keyCodec, valueCodec)` | `NewKeyedMapStateFactoryFromContextAutoCodec[MK,MV](ctx, storeName, keyGroup)` | +| KeyedPriorityQueueStateFactory[V] | `NewKeyedPriorityQueueStateFactoryFromContext(ctx, storeName, keyGroup, itemCodec)` | `NewKeyedPriorityQueueStateFactoryFromContextAutoCodec[V](ctx, storeName, keyGroup)` | +| KeyedAggregatingStateFactory[T,ACC,R] | `NewKeyedAggregatingStateFactoryFromContext(ctx, storeName, keyGroup, accCodec, aggFunc)` | `NewKeyedAggregatingStateFactoryFromContextAutoCodec(ctx, storeName, keyGroup, aggFunc)` | +| KeyedReducingStateFactory[V] | `NewKeyedReducingStateFactoryFromContext(ctx, storeName, keyGroup, valueCodec, reduceFunc)` | `NewKeyedReducingStateFactoryFromContextAutoCodec[V](ctx, storeName, keyGroup, reduceFunc)` | + +### 7.3 从工厂获取按 Key 状态 + +| 工厂 | 方法 | 返回 | +|-----------------------------------|-----------------------------------------------------------------------------------------------------|-----------------------------------------| +| KeyedValueStateFactory[V] | `NewKeyedValue(primaryKey []byte, namespace []byte) (*KeyedValueState[V], error)` | 每个 (primaryKey, namespace) 一个 value 状态。 | +| 
KeyedListStateFactory[V] | `NewKeyedList(primaryKey []byte, namespace []byte) (*KeyedListState[V], error)` | 每个 (primaryKey, namespace) 一个 list 状态。 | +| KeyedMapStateFactory[MK,MV] | `NewKeyedMap(primaryKey []byte, mapName string) (*KeyedMapState[MK,MV], error)` | 每个 (primaryKey, mapName) 一个 map 状态。 | +| KeyedPriorityQueueStateFactory[V] | `NewKeyedPriorityQueue(primaryKey []byte, namespace []byte) (*KeyedPriorityQueueState[V], error)` | 每个 (primaryKey, namespace) 一个 PQ 状态。 | +| KeyedAggregatingStateFactory | `NewAggregatingState(primaryKey []byte, stateName string) (*KeyedAggregatingState[T,ACC,R], error)` | 每个 (primaryKey, stateName) 一个聚合状态。 | +| KeyedReducingStateFactory[V] | `NewReducingState(primaryKey []byte, namespace []byte) (*KeyedReducingState[V], error)` | 每个 (primaryKey, namespace) 一个归约状态。 | + +此处 **primaryKey** 为流 key 值;**namespace** 在使用窗口函数时为窗口字节,否则为空。 + +**设计建议**:每个逻辑状态使用稳定的 keyGroup(如 `[]byte("orders")`)。在工厂方法中,将 keyed 算子收到的**流 key**(如从 key 提取器或消息元数据)作为 primaryKey 传入。 + +--- + +## 8. 错误处理与最佳实践 + +- **状态 API 错误**:创建与方法返回的错误与 `fssdk.SDKError` 兼容(如 `ErrStoreInternal`、`ErrStoreIO`)。Codec 编解码失败会被包装(如 `"encode value state failed"`)。生产环境务必检查并处理错误。 +- **Store 命名**:每个逻辑状态使用稳定、唯一的 store 名称(如 `"counters"`、`"user-sessions"`)。同一运行时中同一名称对应同一 store。 +- **状态缓存**:可在 `Init` 中创建一次状态实例并在 `Process` 中复用,也可每条消息创建。按消息创建是安全的,在不需要分摊创建成本时能保持代码简单。 +- **KeyGroup 设计**:Keyed 状态中,每个“逻辑表”使用一致的 keyGroup。primaryKey 在 keyed 算子中为**流 key** — 使用标识当前记录的 key。使用**窗口函数**时,将窗口标识作为 **namespace** 传入,使状态按 key 与窗口隔离。 +- **有序 codec**:MapState 与 PriorityQueueState 使用 AutoCodec 时,请用基本类型作为 key/元素。自定义结构体 key 需实现 `IsOrderedKeyCodec() == true` 的 `Codec[K]` 并使用“带 codec”的构造方法。 + +--- + +## 9. 
示例 + +### 9.1 ValueState + AutoCodec(计数器) + +```go +import ( + fssdk "github.com/functionstream/function-stream/go-sdk" + "github.com/functionstream/function-stream/go-sdk-advanced/structures" +) + +func (p *MyProcessor) Process(ctx fssdk.Context, sourceID uint32, data []byte) error { + valState, err := structures.NewValueStateFromContextAutoCodec[int64](ctx, "my-store") + if err != nil { + return err + } + cur, _, _ := valState.Value() + if err := valState.Update(cur + 1); err != nil { + return err + } + // emit 或继续... + return nil +} +``` + +### 9.2 Keyed list 工厂(keyed 算子) + +当算子在 **keyed 流**上运行时,使用 Keyed list 工厂,并为每条消息将**流 key** 作为 primaryKey 传入: + +```go +import ( + fssdk "github.com/functionstream/function-stream/go-sdk" + "github.com/functionstream/function-stream/go-sdk-advanced/keyed" +) + +type Order struct { Id string; Amount int64 } + +func (p *MyProcessor) Init(ctx fssdk.Context, config map[string]string) error { + keyGroup := []byte("orders") + factory, err := keyed.NewKeyedListStateFactoryAutoCodecFromContext[Order](ctx, "app-store", keyGroup) + if err != nil { + return err + } + p.listFactory = factory + return nil +} + +func (p *MyProcessor) Process(ctx fssdk.Context, sourceID uint32, data []byte) error { + userID := parseUserID(data) // []byte — 当前记录的流 key + list, err := p.listFactory.NewKeyedList(userID, []byte{}) // 无窗口时 namespace 为空 + if err != nil { + return err + } + if err := list.Add(Order{Id: "1", Amount: 100}); err != nil { + return err + } + items, err := list.Get() + if err != nil { + return err + } + // 使用 items... 
+ return nil +} +``` + +### 9.3 MapState 与 AggregatingState(求和) + +使用 **go-sdk-advanced** 的 `structures` 包;`ctx` 来自低阶 go-sdk 的 `fssdk.Context`。 + +```go +import ( + fssdk "github.com/functionstream/function-stream/go-sdk" + "github.com/functionstream/function-stream/go-sdk-advanced/structures" +) + +// MapState: string -> int64(两者均有有序默认 codec) +m, err := structures.NewMapStateFromContextAutoCodec[string, int64](ctx, "counts") +if err != nil { + return err +} +_ = m.Put("a", 1) +v, ok, _ := m.Get("a") + +// AggregatingState: int64 求和(ACC = int64, R = int64) +type sumAgg struct{} +func (sumAgg) CreateAccumulator() int64 { return 0 } +func (sumAgg) Add(v int64, acc int64) int64 { return acc + v } +func (sumAgg) GetResult(acc int64) int64 { return acc } +func (sumAgg) Merge(a, b int64) int64 { return a + b } + +agg, err := structures.NewAggregatingStateFromContextAutoCodec[int64, int64, int64](ctx, "sum-store", sumAgg{}) +if err != nil { + return err +} +_ = agg.Add(10) +total, _, _ := agg.Get() +``` + +--- + +## 10. 参见 + +- [Go SDK 指南](go-sdk-guide-zh.md) — 主文档:Driver、Context、Store、构建与部署。 +- [Python SDK — 高级状态 API](../Python-SDK/python-sdk-advanced-state-api-zh.md) — Python SDK 的等价带类型状态 API。 +- [examples/go-processor/README.md](../../examples/go-processor/README.md) — 示例算子与构建说明。 diff --git a/docs/Go-SDK/go-sdk-advanced-state-api.md b/docs/Go-SDK/go-sdk-advanced-state-api.md new file mode 100644 index 00000000..4e940a15 --- /dev/null +++ b/docs/Go-SDK/go-sdk-advanced-state-api.md @@ -0,0 +1,322 @@ + + +# Go SDK — Advanced State API + +This document describes the **typed, high-level state API** for the Function Stream Go SDK: state abstractions (ValueState, ListState, MapState, PriorityQueueState, AggregatingState, ReducingState, and Keyed\* factories) built on top of the low-level `Store`, with serialization via **codecs** and optional **keyed state** per primary key. Use it when you need structured state without manual byte encoding or key layout. 
+ +**In this document:** + +- [When to use which API](#1-when-to-use-which-api) — choose the right state type for your use case. +- [Packages and imports](#2-packages-and-imports) — where to find types and codecs. +- [Codec contract](#3-codec-contract-and-default-codecs) — encoding, decoding, and ordering requirements. +- [Creating state](#4-creating-state-with-codec-vs-autocodec) — explicit codec vs AutoCodec constructors. +- [Non-keyed state reference](#5-non-keyed-state-structures) — methods and constructor summary. +- [AggregateFunc and ReduceFunc](#6-aggregatefunc-and-reducefunc) — interfaces for aggregation and reduction. +- [Keyed state](#7-keyed-state-factories-and-per-key-instances) — keyGroup, primaryKey, namespace, and factory methods. +- [Error handling and best practices](#8-error-handling-and-best-practices) — production guidance. +- [Examples](#9-examples) — ValueState, Keyed list, MapState, and AggregatingState. + +--- + +## 1. When to Use Which API + +The advanced state API offers typed views over a single logical store. Pick the abstraction that matches your access pattern: + +| Use case | Recommended API | Notes | +|---------------------------------------------------------|-------------------------------|------------------------------------------------------------| +| Single logical value (counter, config blob, last value) | **ValueState[T]** | One value per store; replace on update. | +| Append-only sequence (event log, history) | **ListState[T]** | Batch add, full read/replace; no key iteration. | +| Key-value map with range/iteration | **MapState[K,V]** | Key type **must** have an ordered codec (e.g. primitives). | +| Priority queue (min/max, top-K) | **PriorityQueueState[T]** | Element type **must** have an ordered codec. | +| Running aggregate (sum, count, custom accumulator) | **AggregatingState[T,ACC,R]** | Uses `AggregateFunc`; mergeable accumulators. 
| +| Running reduce (binary combine) | **ReducingState[V]** | Uses `ReduceFunc`; associative combine. | +| **Per-key** state (per user, per partition) | **Keyed\*** factories | For **keyed operators**; factory + primaryKey per record. | +| Custom key layout, bulk scan, non-typed storage | Low-level **Store** | `Put`/`Get`/`ScanComplex`/`ComplexKey`; full control. | + +**Keyed vs non-keyed** + +- **Keyed state** is for **keyed operators**: streams partitioned by a key (e.g. after keyBy). The runtime delivers records per key; each key should have isolated state. Obtain a **factory** once (from context, store name, and keyGroup), then construct the corresponding state type per **primary key** (stream key) and namespace. +- **Non-keyed state** (ValueState, ListState, etc.) stores one logical entity per store. Use it when there is no key partitioning or you maintain a single global state. + +--- + +## 2. Packages and Imports + +**Two separate libraries:** low-level **go-sdk**, high-level **go-sdk-advanced** (depends on go-sdk). + +| Package | Import path | Responsibility | +|-------------------|------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------| +| **codec** (advanced) | `github.com/functionstream/function-stream/go-sdk-advanced/codec` | `Codec[T]` interface and built-in codecs. | +| **structures** (advanced) | `github.com/functionstream/function-stream/go-sdk-advanced/structures` | ValueState, ListState, MapState, PriorityQueueState, AggregatingState, ReducingState. | +| **keyed** (advanced) | `github.com/functionstream/function-stream/go-sdk-advanced/keyed` | Keyed state factories and per-key types. Use in keyed operators. | + +All state constructors take `api.Context` (i.e. `fssdk.Context`) and a **store name**. The store is obtained internally via `ctx.GetOrCreateStore(storeName)`. 
The same store name always refers to the same backing store (RocksDB in the default implementation). + +--- + +## 3. Codec Contract and Default Codecs + +### 3.1 Codec interface + +`codec.Codec[T]`: + +| Method | Description | +|-----------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `Encode(value T) ([]byte, error)` | Serialize a value to bytes. | +| `Decode(data []byte) (T, error)` | Deserialize from bytes. | +| `EncodedSize() int` | Fixed size if `> 0`; variable size if `<= 0` (used for list optimizations). | +| `IsOrderedKeyCodec() bool` | If `true`, the byte encoding is **totally ordered**: lexicographic order of bytes corresponds to a well-defined order of values. **Required** for MapState key and PriorityQueueState element. | + +### 3.2 DefaultCodecFor[T]() + +`codec.DefaultCodecFor[T]()` returns a default codec for type `T`: + +- **Primitives** (`int32`, `int64`, `uint32`, `uint64`, `float32`, `float64`, `string`, `bool`, `int`, `uint`, etc.): built-in codecs; **ordered** when used as map keys or PQ elements. +- **Struct, map, slice, array**: `JSONCodec[T]` — JSON encoding; **not ordered** (`IsOrderedKeyCodec() == false`). Do **not** use as MapState key or PriorityQueueState element type with AutoCodec (operations that depend on ordering may fail or panic). +- **Interface type without constraint**: returns error; the type parameter must be concrete. + +### 3.3 Ordering requirement + +For **MapState[K,V]** and **PriorityQueueState[T]**, the key (respectively element) type must use a codec with `IsOrderedKeyCodec() == true`. With **AutoCodec** constructors for Map or PQ, use primitive key/element types (e.g. `int64`, `string`) or provide an explicit ordered codec. + +--- + +## 4. Creating State: With Codec vs AutoCodec + +Two constructor families: + +1. 
**Explicit codec** — `NewXxxFromContext(ctx, storeName, codec, ...)` + You supply a `Codec[T]` (and for Map: key + value codecs; for Aggregating/Reducing: acc/value codec plus function). Full control over encoding and ordering. + +2. **AutoCodec** — `NewXxxFromContextAutoCodec(ctx, storeName)` or `(ctx, storeName, aggFunc/reduceFunc)` + The SDK uses `codec.DefaultCodecFor[T]()` for the value/accumulator type. For Map and PQ, the key/element type must have an **ordered** default (primitives); otherwise creation or operations may return `ErrStoreInternal`. + +State instances are **lightweight**. You can create them per call (e.g. inside `Process`) or cache in the Driver (e.g. in `Init`). The same store name always refers to the same underlying store; only the typed view differs. + +--- + +## 5. Non-Keyed State (structures) + +### 5.1 Semantics and methods + +| State | Semantics | Main methods | Ordered codec? | +|-------------------------------|-------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------|--------------------| +| **ValueState[T]** | Single replaceable value. | `Update(value T) error`; `Value() (T, bool, error)`; `Clear() error` | No | +| **ListState[T]** | Append-only list; batch add and full replace. | `Add(value T) error`; `AddAll(values []T) error`; `Get() ([]T, error)`; `Update(values []T) error`; `Clear() error` | No | +| **MapState[K,V]** | Key-value map; iteration via `All()`. | `Put(key K, value V) error`; `Get(key K) (V, bool, error)`; `Delete(key K) error`; `Clear() error`; `All() iter.Seq2[K,V]` | **Key K: yes** | +| **PriorityQueueState[T]** | Priority queue (min-first by encoded order). | `Add(value T) error`; `Peek() (T, bool, error)`; `Poll() (T, bool, error)`; `Clear() error`; `All() iter.Seq[T]` | **Item T: yes** | +| **AggregatingState[T,ACC,R]** | Running aggregation with mergeable accumulator. 
| `Add(value T) error`; `Get() (R, bool, error)`; `Clear() error` | No (ACC codec any) | +| **ReducingState[V]** | Running reduce with binary combine. | `Add(value V) error`; `Get() (V, bool, error)`; `Clear() error` | No | + +### 5.2 Constructor summary (non-keyed) + +| State | With codec | AutoCodec | +|---------------------------|------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------| +| ValueState[T] | `NewValueStateFromContext(ctx, storeName, valueCodec)` | `NewValueStateFromContextAutoCodec[T](ctx, storeName)` | +| ListState[T] | `NewListStateFromContext(ctx, storeName, itemCodec)` | `NewListStateFromContextAutoCodec[T](ctx, storeName)` | +| MapState[K,V] | `NewMapStateFromContext(ctx, storeName, keyCodec, valueCodec)` or `NewMapStateAutoKeyCodecFromContext(ctx, storeName, valueCodec)` | `NewMapStateFromContextAutoCodec[K,V](ctx, storeName)` | +| PriorityQueueState[T] | `NewPriorityQueueStateFromContext(ctx, storeName, itemCodec)` | `NewPriorityQueueStateFromContextAutoCodec[T](ctx, storeName)` | +| AggregatingState[T,ACC,R] | `NewAggregatingStateFromContext(ctx, storeName, accCodec, aggFunc)` | `NewAggregatingStateFromContextAutoCodec(ctx, storeName, aggFunc)` | +| ReducingState[V] | `NewReducingStateFromContext(ctx, storeName, valueCodec, reduceFunc)` | `NewReducingStateFromContextAutoCodec[V](ctx, storeName, reduceFunc)` | + +--- + +## 6. AggregateFunc and ReduceFunc + +### 6.1 AggregateFunc[T, ACC, R] + +**AggregatingState** requires an **AggregateFunc[T, ACC, R]** (in package `structures`): + +| Method | Description | +|-------------------------------------|-------------------------------------------------------------------------------------| +| `CreateAccumulator() ACC` | Initial accumulator for empty state. | +| `Add(value T, accumulator ACC) ACC` | Fold one input value into the accumulator. 
| +| `GetResult(accumulator ACC) R` | Produce the final result from the accumulator. | +| `Merge(a, b ACC) ACC` | Combine two accumulators (e.g. for merge in distributed or checkpointed execution). | + +### 6.2 ReduceFunc[V] + +**ReducingState** requires a **ReduceFunc[V]** (function type): `func(value1, value2 V) (V, error)`. It must be **associative** (and ideally commutative) so that repeated application yields a well-defined reduced value. + +--- + +## 7. Keyed State — Factories and Per-Key Instances + +Keyed state is for **keyed operators**: when the stream is partitioned by a key (e.g. after keyBy), each key is processed with isolated state. You obtain a **factory** once (from context, store name, and **keyGroup**), then create state **per primary key** — the stream key for the current record (e.g. user ID, partition key). + +State is organized by **keyGroup** ([]byte) and **primary key** ([]byte). Create the factory from context, store name, and keyGroup; then call factory methods to get state for a given primary key. + +### 7.1 keyGroup, key (primaryKey), and namespace + +The Keyed API maps onto the store’s **ComplexKey** with three dimensions: + +| Term | Where it appears | Meaning | +|---------------|----------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| **keyGroup** | Argument when creating the factory | The **keyed group**: identifies which keyed partition/group this state belongs to. Use one keyGroup per logical “keyed group” or state kind (e.g. `[]byte("counters")`, `[]byte("sessions")`). Same keyed group ⇒ same keyGroup bytes. | +| **key** | `primaryKey` in factory methods (e.g. 
`NewKeyedList(primaryKey, namespace)`) | The **value of the stream key**: the key that partitioned the stream, serialized as bytes (e.g. user ID, partition key). Each distinct primaryKey gets isolated state. | +| **namespace** | `namespace` ([]byte) in factory methods that take it | **With window functions**: use the **window identifier as bytes** (e.g. serialized window bounds or window ID) so state is scoped per key *and* per window. **Without windows**: pass **empty bytes** (`nil` or `[]byte{}`). | + +**Summary:** **keyGroup** = keyed group identifier; **key** (primaryKey) = stream key value; **namespace** = window bytes when using windows, otherwise empty. + +### 7.2 Factory constructor summary (keyed) + +| Factory | With codec | AutoCodec | +|---------------------------------------|---------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------| +| KeyedValueStateFactory[V] | `NewKeyedValueStateFactoryFromContext(ctx, storeName, keyGroup, valueCodec)` | `NewKeyedValueStateFactoryFromContextAutoCodec[V](ctx, storeName, keyGroup)` | +| KeyedListStateFactory[V] | `NewKeyedListStateFactoryFromContext(ctx, storeName, keyGroup, valueCodec)` | `NewKeyedListStateFactoryAutoCodecFromContext[V](ctx, storeName, keyGroup)` | +| KeyedMapStateFactory[MK,MV] | `NewKeyedMapStateFactoryFromContext(ctx, storeName, keyGroup, keyCodec, valueCodec)` | `NewKeyedMapStateFactoryFromContextAutoCodec[MK,MV](ctx, storeName, keyGroup)` | +| KeyedPriorityQueueStateFactory[V] | `NewKeyedPriorityQueueStateFactoryFromContext(ctx, storeName, keyGroup, itemCodec)` | `NewKeyedPriorityQueueStateFactoryFromContextAutoCodec[V](ctx, storeName, keyGroup)` | +| KeyedAggregatingStateFactory[T,ACC,R] | `NewKeyedAggregatingStateFactoryFromContext(ctx, storeName, keyGroup, accCodec, aggFunc)` | `NewKeyedAggregatingStateFactoryFromContextAutoCodec(ctx, storeName, keyGroup, aggFunc)` 
| +| KeyedReducingStateFactory[V] | `NewKeyedReducingStateFactoryFromContext(ctx, storeName, keyGroup, valueCodec, reduceFunc)` | `NewKeyedReducingStateFactoryFromContextAutoCodec[V](ctx, storeName, keyGroup, reduceFunc)` | + +### 7.3 Obtaining per-key state from a factory + +| Factory | Method | Returns | +|-----------------------------------|-----------------------------------------------------------------------------------------------------|------------------------------------------------| +| KeyedValueStateFactory[V] | `NewKeyedValue(primaryKey []byte, namespace []byte) (*KeyedValueState[V], error)` | One value state per (primaryKey, namespace). | +| KeyedListStateFactory[V] | `NewKeyedList(primaryKey []byte, namespace []byte) (*KeyedListState[V], error)` | List state per (primaryKey, namespace). | +| KeyedMapStateFactory[MK,MV] | `NewKeyedMap(primaryKey []byte, mapName string) (*KeyedMapState[MK,MV], error)` | Map state per (primaryKey, mapName). | +| KeyedPriorityQueueStateFactory[V] | `NewKeyedPriorityQueue(primaryKey []byte, namespace []byte) (*KeyedPriorityQueueState[V], error)` | PQ state per (primaryKey, namespace). | +| KeyedAggregatingStateFactory | `NewAggregatingState(primaryKey []byte, stateName string) (*KeyedAggregatingState[T,ACC,R], error)` | Aggregating state per (primaryKey, stateName). | +| KeyedReducingStateFactory[V] | `NewReducingState(primaryKey []byte, namespace []byte) (*KeyedReducingState[V], error)` | Reducing state per (primaryKey, namespace). | + +Here **primaryKey** is the stream key value; **namespace** is the window bytes when using window functions, or empty when not. + +**Design tip:** Use a stable keyGroup per logical state (e.g. `[]byte("orders")`). In factory methods, pass the **stream key** your keyed operator received (e.g. from key extractor or message metadata) as primaryKey. + +--- + +## 8. 
Error Handling and Best Practices + +- **State API errors**: Constructors and state methods return errors compatible with `fssdk.SDKError` (e.g. `ErrStoreInternal`, `ErrStoreIO`). Codec encode/decode failures are wrapped (e.g. `"encode value state failed"`). Always check and handle errors in production. +- **Store naming**: Use stable, unique store names per logical state (e.g. `"counters"`, `"user-sessions"`). The same name in the same runtime refers to the same store. +- **Caching state**: You can create a state instance once in `Init` and reuse it in `Process`, or create it per message. Per-message creation is safe and keeps code simple when you do not need to amortize creation cost. +- **KeyGroup design**: For keyed state, use a consistent keyGroup per “logical table”. primaryKey is the **stream key** in keyed operators, so use the key that identifies the current record. With **window functions**, pass the window identifier as **namespace** so state is per key and per window. +- **Ordered codec**: For MapState and PriorityQueueState with AutoCodec, use primitive key/element types. For custom struct keys, implement a `Codec[K]` with `IsOrderedKeyCodec() == true` and use the “with codec” constructor. + +--- + +## 9. Examples + +### 9.1 ValueState with AutoCodec (counter) + +```go +import ( + fssdk "github.com/functionstream/function-stream/go-sdk" + "github.com/functionstream/function-stream/go-sdk-advanced/structures" +) + +func (p *MyProcessor) Process(ctx fssdk.Context, sourceID uint32, data []byte) error { + valState, err := structures.NewValueStateFromContextAutoCodec[int64](ctx, "my-store") + if err != nil { + return err + } + // The found flag is ignored here; the counter is assumed to start from the zero value. + cur, _, err := valState.Value() + if err != nil { + return err + } + if err := valState.Update(cur + 1); err != nil { + return err + } + // emit or continue... 
+ return nil +} +``` + +### 9.2 Keyed list factory (keyed operator) + +When the operator runs on a **keyed stream**, use a Keyed list factory and pass the **stream key** as primaryKey for each message: + +```go +import ( + fssdk "github.com/functionstream/function-stream/go-sdk" + "github.com/functionstream/function-stream/go-sdk-advanced/keyed" +) + +type Order struct { Id string; Amount int64 } + +func (p *MyProcessor) Init(ctx fssdk.Context, config map[string]string) error { + keyGroup := []byte("orders") + factory, err := keyed.NewKeyedListStateFactoryAutoCodecFromContext[Order](ctx, "app-store", keyGroup) + if err != nil { + return err + } + p.listFactory = factory + return nil +} + +func (p *MyProcessor) Process(ctx fssdk.Context, sourceID uint32, data []byte) error { + userID := parseUserID(data) // []byte — stream key for this record + list, err := p.listFactory.NewKeyedList(userID, []byte{}) // empty namespace when no windows + if err != nil { + return err + } + if err := list.Add(Order{Id: "1", Amount: 100}); err != nil { + return err + } + items, err := list.Get() + if err != nil { + return err + } + // use items... + return nil +} +``` + +### 9.3 MapState and AggregatingState (sum) + +Use the **go-sdk-advanced** `structures` package; `ctx` is `fssdk.Context` from the low-level go-sdk. 
+ +```go +import ( + fssdk "github.com/functionstream/function-stream/go-sdk" + "github.com/functionstream/function-stream/go-sdk-advanced/structures" +) + +// MapState: string -> int64 (both have ordered default codecs) +m, err := structures.NewMapStateFromContextAutoCodec[string, int64](ctx, "counts") +if err != nil { + return err +} +_ = m.Put("a", 1) +v, ok, _ := m.Get("a") + +// AggregatingState: sum of int64 (ACC = int64, R = int64) +type sumAgg struct{} +func (sumAgg) CreateAccumulator() int64 { return 0 } +func (sumAgg) Add(v int64, acc int64) int64 { return acc + v } +func (sumAgg) GetResult(acc int64) int64 { return acc } +func (sumAgg) Merge(a, b int64) int64 { return a + b } + +agg, err := structures.NewAggregatingStateFromContextAutoCodec[int64, int64, int64](ctx, "sum-store", sumAgg{}) +if err != nil { + return err +} +_ = agg.Add(10) +total, _, _ := agg.Get() +``` + +--- + +## 10. See Also + +- [Go SDK Guide](go-sdk-guide.md) — main guide: Driver, Context, Store, build, and deployment. +- [Go SDK — 高级状态 API(中文)](go-sdk-advanced-state-api-zh.md) — 本文档的中文版。 +- [Python SDK — Advanced State API](../Python-SDK/python-sdk-advanced-state-api.md) — equivalent typed state API for the Python SDK. +- [examples/go-processor/README.md](../../examples/go-processor/README.md) — example operator and build instructions. 
diff --git a/docs/go-sdk-guide-zh.md b/docs/Go-SDK/go-sdk-guide-zh.md similarity index 88% rename from docs/go-sdk-guide-zh.md rename to docs/Go-SDK/go-sdk-guide-zh.md index 749b3ef6..024aa5b9 100644 --- a/docs/go-sdk-guide-zh.md +++ b/docs/Go-SDK/go-sdk-guide-zh.md @@ -263,7 +263,7 @@ create function with ( ); ``` -config.yaml 中需配置 `name`、`type: processor`、`input-groups`、`outputs`(如 Kafka)。详见 [Function 配置](function-configuration-zh.md) 与 [examples/go-processor/README.md](../examples/go-processor/README.md)。 +config.yaml 中需配置 `name`、`type: processor`、`input-groups`、`outputs`(如 Kafka)。详见 [Function 配置](../function-configuration-zh.md) 与 [examples/go-processor/README.md](../../examples/go-processor/README.md)。 --- @@ -301,7 +301,18 @@ if err != nil { --- -## 七、目录结构参考 +## 七、高级状态 API(进阶文档) + +本指南仅覆盖**低阶 go-sdk**(Driver、Context、Store、目录结构)。**高级状态 API**(Codec、ValueState、ListState、MapState、PriorityQueueState、AggregatingState、ReducingState、Keyed\* 工厂与用法)由独立库 **go-sdk-advanced** 提供,完整说明、Codec 约定、构造函数表与示例均在进阶文档中: + +- **[Go SDK — 高级状态 API](go-sdk-advanced-state-api-zh.md)**(中文) +- [Go SDK — Advanced State API](go-sdk-advanced-state-api.md)(英文) + +--- + +## 八、目录结构参考 + +**低阶库 go-sdk**: ```text go-sdk/ @@ -317,8 +328,20 @@ go-sdk/ │ ├── runtime.go │ ├── context.go │ └── store.go +├── state/ +│ └── common/ # 公共辅助(Store 类型别名、DupBytes) ├── wit/ # processor.wit 及依赖(可由 make wit 生成) └── bindings/ # wit-bindgen-go 生成的 Go 代码(make bindings) ``` -更多示例与 SQL 操作见 [examples/go-processor/README.md](../examples/go-processor/README.md)、[SQL CLI 指南](sql-cli-guide-zh.md)。 +**高阶库 go-sdk-advanced**(依赖 go-sdk,含 Codec 与全部状态类型): + +```text +go-sdk-advanced/ +├── go.mod # require go-sdk +├── codec/ # Codec[T]、DefaultCodecFor、内置与 JSON codec +├── structures/ # ValueState、ListState、MapState、PriorityQueue、Aggregating、Reducing +└── keyed/ # Keyed 状态工厂(value、list、map、PQ、aggregating、reducing) +``` + +更多示例与 SQL 操作见 [examples/go-processor/README.md](../../examples/go-processor/README.md)、[SQL CLI 
指南](../sql-cli-guide-zh.md)。 diff --git a/docs/go-sdk-guide.md b/docs/Go-SDK/go-sdk-guide.md similarity index 91% rename from docs/go-sdk-guide.md rename to docs/Go-SDK/go-sdk-guide.md index 641e1017..c56c6c60 100644 --- a/docs/go-sdk-guide.md +++ b/docs/Go-SDK/go-sdk-guide.md @@ -263,7 +263,7 @@ create function with ( ); ``` -Configure `name`, `type: processor`, `input-groups`, and `outputs` (e.g. Kafka) in config.yaml. See [Function Configuration](function-configuration.md) and [examples/go-processor/README.md](../examples/go-processor/README.md). +Configure `name`, `type: processor`, `input-groups`, and `outputs` (e.g. Kafka) in config.yaml. See [Function Configuration](../function-configuration.md) and [examples/go-processor/README.md](../../examples/go-processor/README.md). --- @@ -301,7 +301,18 @@ if err != nil { --- -## 7. Directory Layout +## 7. Advanced State API (see advanced doc) + +This guide covers only the **low-level go-sdk** (Driver, Context, Store, directory layout). The **advanced state API** (Codec, ValueState, ListState, MapState, PriorityQueueState, AggregatingState, ReducingState, Keyed\* factories and usage) is provided by a separate library **go-sdk-advanced**. Full reference, codec contract, constructor tables, and examples are in the advanced document: + +- **[Go SDK — Advanced State API](go-sdk-advanced-state-api.md)** (English) +- [Go SDK — 高级状态 API](go-sdk-advanced-state-api-zh.md) (中文) + +--- + +## 8. Directory Layout + +**Low-level library (go-sdk):** ```text go-sdk/ @@ -317,8 +328,10 @@ go-sdk/ │ ├── runtime.go │ ├── context.go │ └── store.go +├── state/ +│ └── common/ # Shared helpers (Store type alias, DupBytes) ├── wit/ # processor.wit and deps (make wit) └── bindings/ # Generated by wit-bindgen-go (make bindings) ``` -For more examples and SQL operations, see [examples/go-processor/README.md](../examples/go-processor/README.md) and the [SQL CLI Guide](sql-cli-guide.md). 
+For more examples and SQL operations, see [examples/go-processor/README.md](../../examples/go-processor/README.md) and the [SQL CLI Guide](../sql-cli-guide.md). diff --git a/docs/Python-SDK/python-sdk-advanced-state-api-zh.md b/docs/Python-SDK/python-sdk-advanced-state-api-zh.md new file mode 100644 index 00000000..1c172393 --- /dev/null +++ b/docs/Python-SDK/python-sdk-advanced-state-api-zh.md @@ -0,0 +1,174 @@ + + +# Python SDK — 高级状态 API + +本文档介绍 Python SDK 的**高级状态 API**:基于底层 KvStore 的带类型状态抽象(ValueState、ListState、MapState 等),通过 **codec** 序列化,并支持按主键的 **keyed state**。 + +**两个独立库:** 高级状态 API 由 **functionstream-api-advanced** 提供,依赖低阶 **functionstream-api**。安装:`pip install functionstream-api functionstream-api-advanced`。使用时从 `fs_api_advanced` 导入 Codec、ValueState、ListState、MapState 等。 + +| 库 | 包名 | 内容 | +|----|------|------| +| **functionstream-api**(低阶) | `fs_api` | Context(仅 getOrCreateKVStore、getConfig、emit)、KvStore、KvIterator、ComplexKey、错误类。 | +| **functionstream-api-advanced**(高阶) | `fs_api_advanced` | Codec、ValueState、ListState、MapState、PriorityQueueState、AggregatingState、ReducingState、Keyed\* 工厂与状态类型。 | + +--- + +## 1. 概述 + +当需要结构化状态(单值、列表、Map、优先队列、聚合、归约)而不想手写字节编码或 key 布局时,可使用高级状态 API。创建方式有两种:通过**运行时的 Context**(如使用 functionstream-runtime 时 `ctx.getOrCreateValueState(...)`)或通过状态类型上的**类型级构造方法**(推荐,便于复用)。 + +--- + +## 2. 
创建状态的两种方式 + +### 2.1 通过 Context(getOrCreate\*) + +使用 **functionstream-api-advanced** 时,运行时的 Context 实现(如 functionstream-runtime 的 WitContext)会提供 `getOrCreateValueState(store_name, codec)`、`getOrCreateValueStateAutoCodec(store_name)` 以及 ListState、MapState、PriorityQueueState、AggregatingState、ReducingState 与所有 Keyed\* 工厂的对应方法,内部委托给下面所述的类型级 `from_context` / `from_context_auto_codec`。 + +### 2.2 通过状态类型(推荐) + +每种状态类型和 keyed 工厂提供: + +- **带 codec:** `XxxState.from_context(ctx, store_name, codec, ...)` +- **AutoCodec:** `XxxState.from_context_auto_codec(ctx, store_name)` 或带可选类型参数,由 SDK 使用默认 codec(如 PickleCodec,或 Map key / PQ 元素所需的有序 codec)。 + +状态实例是轻量的;可在每次 `process` 中创建,或在 driver 中(如 `init`)缓存。同一 store 名称对应同一底层 store。 + +--- + +## 3. 非 Keyed 状态 — 构造方法一览 + +| 状态类型 | 带 codec | AutoCodec | +|--------------------|-----------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------| +| ValueState | `ValueState.from_context(ctx, store_name, codec)` | `ValueState.from_context_auto_codec(ctx, store_name)` | +| ListState | `ListState.from_context(ctx, store_name, codec)` | `ListState.from_context_auto_codec(ctx, store_name)` | +| MapState | `MapState.from_context(ctx, store_name, key_codec, value_codec)` 或 `MapState.from_context_auto_key_codec(ctx, store_name, value_codec)` | — | +| PriorityQueueState | `PriorityQueueState.from_context(ctx, store_name, codec)` | `PriorityQueueState.from_context_auto_codec(ctx, store_name)` | +| AggregatingState | `AggregatingState.from_context(ctx, store_name, acc_codec, agg_func)` | `AggregatingState.from_context_auto_codec(ctx, store_name, agg_func)` | +| ReducingState | `ReducingState.from_context(ctx, store_name, value_codec, reduce_func)` | `ReducingState.from_context_auto_codec(ctx, store_name, reduce_func)` | + +以上均可通过 Context 的 `ctx.getOrCreate*` 方法获得(如 `ctx.getOrCreateValueState(store_name, 
codec)`),其内部会委托给上述构造方法。 + +--- + +## 4. Keyed 状态 — 工厂与 key_group / key / namespace + +**Keyed 状态面向 keyed 算子。** 流按 key 分区(如 keyBy)时,每个 key 拥有独立状态。可先获取一次**工厂**(通过 context、store 名称、**namespace** 和 **key_group**),再按**主键**(当前记录的流 key)创建状态。 + +### 4.1 key_group、key(主键)与 namespace + +| 概念 | API 参数 | 含义 | +|---------------|------------------------------------------|---------------------------------------------------------| +| **key_group** | 创建工厂时的 `key_group` | **keyed 组**:标识该状态所属分区/组(如一组 “counters”,另一组 “sessions”)。 | +| **key** | 工厂方法参数(如 `new_keyed_value(primary_key, namespace)`) | 当前记录的**流 key 的值**(如用户 ID、分区 key)。不同 key 对应不同状态。 | +| **namespace** | 创建工厂时的 `namespace`(bytes) | **有窗口时**为**窗口标识的 bytes**;**无窗口时**传**空 bytes**(如 `b""`)。 | + +### 4.2 Keyed 工厂构造方法一览 + +| 工厂 | 带 codec | AutoCodec | +|--------------------------------|-----------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------| +| KeyedValueStateFactory | `KeyedValueStateFactory.from_context(ctx, store_name, key_group, value_codec)` | `KeyedValueStateFactory.from_context_auto_codec(ctx, store_name, key_group, value_type=None)` | +| KeyedListStateFactory | `KeyedListStateFactory.from_context(ctx, store_name, key_group, value_codec)` | `KeyedListStateFactory.from_context_auto_codec(ctx, store_name, key_group, value_type=None)` | +| KeyedMapStateFactory | `KeyedMapStateFactory.from_context(ctx, store_name, key_group, map_key_codec, map_value_codec)` | `KeyedMapStateFactory.from_context_auto_codec(ctx, store_name, key_group, map_key_type=None, map_value_type=None)` | +| KeyedPriorityQueueStateFactory | `KeyedPriorityQueueStateFactory.from_context(ctx, store_name, key_group, item_codec)` | `KeyedPriorityQueueStateFactory.from_context_auto_codec(ctx, store_name, key_group, item_type=None)` | +| KeyedAggregatingStateFactory | 
`KeyedAggregatingStateFactory.from_context(ctx, store_name, key_group, acc_codec, agg_func)` | `KeyedAggregatingStateFactory.from_context_auto_codec(ctx, store_name, key_group, agg_func, acc_type=None)` | +| KeyedReducingStateFactory | `KeyedReducingStateFactory.from_context(ctx, store_name, key_group, value_codec, reduce_func)` | `KeyedReducingStateFactory.from_context_auto_codec(ctx, store_name, key_group, reduce_func, value_type=None)` | + +也可使用 Context 的 `ctx.getOrCreateKeyed*Factory(...)` 方法,其内部会委托给上述构造方法。 + +### 4.3 KeyedValueState + +KeyedValueState 与 Go SDK 一致:工厂仅需 `key_group`(无 namespace)。工厂:`KeyedValueStateFactory.from_context(ctx, store_name, key_group, value_codec)` 或 `from_context_auto_codec(ctx, store_name, key_group, value_type=None)`。创建状态:`factory.new_keyed_value(primary_key, namespace)`(namespace 为 bytes,必填)。状态方法:`update(value)`、`value()`(返回 `(value, found)`)、`clear()`。 + +### 4.4 KeyedListState + +KeyedListState 与 Go SDK 一致:工厂仅需 `key_group`(无 namespace),创建列表时再传入 **key** 与 **namespace**。工厂:`KeyedListStateFactory.from_context(ctx, store_name, key_group, value_codec)` 或 `from_context_auto_codec(ctx, store_name, key_group, value_type=None)`。创建列表:`factory.new_keyed_list(key, namespace)`,得到 `KeyedListState[V]`。状态方法:`add(value)`、`add_all(values)`、`get()`(返回 `List[V]`)、`update(values)`(先清空再整体写入)、`clear()`。 + +### 4.5 KeyedAggregatingState + +KeyedAggregatingState 与 Go SDK 一致:工厂仅需 `key_group`(无 namespace)。工厂:`KeyedAggregatingStateFactory.from_context(ctx, store_name, key_group, acc_codec, agg_func)` 或 `from_context_auto_codec(ctx, store_name, key_group, agg_func, acc_type=None)`。创建状态:`factory.new_aggregating_state(primary_key, state_name="")`,得到绑定到该 (primary_key, namespace=state_name) 的 `KeyedAggregatingState[T, ACC, R]`。状态方法:`add(value)`(向当前状态的 accumulator 合并)、`get()`(返回 `(result, found)`)、`clear()`。 + +### 4.6 KeyedMapState + +KeyedMapState 与 Go SDK 一致:工厂仅需 `key_group`(无 namespace),且 map key 的 codec 必须有序。工厂:`KeyedMapStateFactory.from_context(ctx, 
store_name, key_group, map_key_codec, map_value_codec)` 或 `from_context_auto_codec(ctx, store_name, key_group, map_key_type=None, map_value_type=None)`。创建 map:`factory.new_keyed_map(primary_key, map_name)`(map_name 必填,转为 namespace),得到 `KeyedMapState[MK, MV]`。状态方法:`put(map_key, value)`、`get(map_key)`(返回 `(value, found)`)、`delete(map_key)`、`clear()`(按前缀删除本 map 全部条目)、`all()`(迭代 `(map_key, value)`)。 + +### 4.7 KeyedPriorityQueueState + +KeyedPriorityQueueState 与 Go SDK 一致:工厂仅需 `key_group`(无 namespace),元素 codec 必须有序。工厂:`KeyedPriorityQueueStateFactory.from_context(ctx, store_name, key_group, item_codec)` 或 `from_context_auto_codec(ctx, store_name, key_group, item_type=None)`。创建队列:`factory.new_keyed_priority_queue(primary_key, namespace)`(primary_key 与 namespace 均必填,bytes),得到 `KeyedPriorityQueueState[V]`。状态方法:`add(value)`、`peek()`(返回 `(min_element, found)`)、`poll()`(取出并返回最小元素)、`clear()`(按前缀删除全部)、`all()`(按序迭代所有元素)。 + +### 4.8 KeyedReducingState + +KeyedReducingState 与 Go SDK 一致:工厂仅需 `key_group`(无 namespace)。工厂:`KeyedReducingStateFactory.from_context(ctx, store_name, key_group, value_codec, reduce_func)` 或 `from_context_auto_codec(ctx, store_name, key_group, reduce_func, value_type=None)`。创建状态:`factory.new_reducing_state(primary_key, namespace)`(两者必填,bytes),得到 `KeyedReducingState[V]`。状态方法:`add(value)`(与当前值经 reduce_func 合并后写入)、`get()`(返回 `(value, found)`)、`clear()`。 + +--- + +## 5. 
示例 + +### 5.1 ValueState(from_context_auto_codec) + +从 **fs_api_advanced** 导入 ValueState(Codec、ListState、MapState 等同此包): + +```python +from fs_api import FSProcessorDriver, Context +from fs_api_advanced import ValueState + +class CounterProcessor(FSProcessorDriver): + def process(self, ctx: Context, source_id: int, data: bytes): + state = ValueState.from_context_auto_codec(ctx, "my-store") + cur = state.value() + if cur is None: + cur = 0 + state.update(cur + 1) + ctx.emit(str(cur + 1).encode(), 0) +``` + +### 5.2 KeyedValueState(keyed 算子) + +流按 key 分区时,在 `init` 中创建工厂,在 `process` 中按当前记录的 `primary_key` 取状态,再 `update(value)` / `value()` / `clear()`: + +```python +from fs_api import FSProcessorDriver, Context +from fs_api_advanced import KeyedValueStateFactory + +class KeyedCounterProcessor(FSProcessorDriver): + def init(self, ctx: Context, config: dict): + self._factory = KeyedValueStateFactory.from_context_auto_codec( + ctx, "counters", b"by_key", value_type=int + ) + + def process(self, ctx: Context, source_id: int, data: bytes): + primary_key = data[:8] + state = self._factory.new_keyed_value(primary_key, b"count") + cur, found = state.value() + if not found: + cur = 0 + state.update(cur + 1) + ctx.emit(str(cur + 1).encode(), 0) +``` + +其他状态类型按上表使用 `XxxState.from_context(ctx, store_name, ...)` 或 `XxxState.from_context_auto_codec(ctx, store_name)`。 + +--- + +## 6. 参见 + +- [Python SDK 指南](python-sdk-guide-zh.md) — fs_api、fs_client 及 Context/KvStore 基础用法。 diff --git a/docs/Python-SDK/python-sdk-advanced-state-api.md b/docs/Python-SDK/python-sdk-advanced-state-api.md new file mode 100644 index 00000000..07256cb3 --- /dev/null +++ b/docs/Python-SDK/python-sdk-advanced-state-api.md @@ -0,0 +1,174 @@ + + +# Python SDK — Advanced State API + +This document describes the **high-level state API** for the Python SDK: typed state abstractions (ValueState, ListState, MapState, etc.) 
built on top of the low-level KvStore, with serialization via **codecs** and optional **keyed state** per primary key. + +**Two separate libraries:** The advanced state API is provided by **functionstream-api-advanced**, which depends on the low-level **functionstream-api**. Install with: `pip install functionstream-api functionstream-api-advanced`. Import Codec, ValueState, ListState, MapState, etc. from `fs_api_advanced`. + +| Library | Package | Contents | +|---------|---------|----------| +| **functionstream-api** (low-level) | `fs_api` | Context (getOrCreateKVStore, getConfig, emit only), KvStore, KvIterator, ComplexKey, error types. | +| **functionstream-api-advanced** (high-level) | `fs_api_advanced` | Codec, ValueState, ListState, MapState, PriorityQueueState, AggregatingState, ReducingState, Keyed\* factories and state types. | + +--- + +## 1. Overview + +Use the advanced state API when you need structured state (single value, list, map, priority queue, aggregation, reduction) without manual byte encoding or key layout. You can create state either from the **runtime Context** (e.g. `ctx.getOrCreateValueState(...)` when using functionstream-runtime) or via **type-level constructors** on the state class (recommended for clarity and reuse). + +--- + +## 2. Creating State: Two Ways + +### 2.1 From Context (getOrCreate\*) + +When using **functionstream-api-advanced**, the runtime Context implementation (e.g. WitContext in functionstream-runtime) provides `getOrCreateValueState(store_name, codec)`, `getOrCreateValueStateAutoCodec(store_name)`, and the same pattern for ListState, MapState, PriorityQueueState, AggregatingState, ReducingState, and all Keyed\* factories; these delegate to the type-level `from_context` / `from_context_auto_codec` methods below. + +### 2.2 From the state type (recommended) + +Each state type and keyed factory provides: + +- **With codec:** `XxxState.from_context(ctx, store_name, codec, ...)` — you pass the codec(s). 
+- **AutoCodec:** `XxxState.from_context_auto_codec(ctx, store_name)` or with optional type hint — the SDK uses a default codec (e.g. `PickleCodec`, or ordered codecs for map key / PQ element where required). + +State instances are lightweight; you may create them per message in `process` or cache in the driver (e.g. in `init`). Same store name yields the same underlying store. + +--- + +## 3. Non-Keyed State — Constructor Summary + +| State | With codec | AutoCodec | +|--------------------|------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------| +| ValueState | `ValueState.from_context(ctx, store_name, codec)` | `ValueState.from_context_auto_codec(ctx, store_name)` | +| ListState | `ListState.from_context(ctx, store_name, codec)` | `ListState.from_context_auto_codec(ctx, store_name)` | +| MapState | `MapState.from_context(ctx, store_name, key_codec, value_codec)` or `MapState.from_context_auto_key_codec(ctx, store_name, value_codec)` | — | +| PriorityQueueState | `PriorityQueueState.from_context(ctx, store_name, codec)` | `PriorityQueueState.from_context_auto_codec(ctx, store_name)` | +| AggregatingState | `AggregatingState.from_context(ctx, store_name, acc_codec, agg_func)` | `AggregatingState.from_context_auto_codec(ctx, store_name, agg_func)` | +| ReducingState | `ReducingState.from_context(ctx, store_name, value_codec, reduce_func)` | `ReducingState.from_context_auto_codec(ctx, store_name, reduce_func)` | + +All of the above can also be obtained via the corresponding `ctx.getOrCreate*` methods (e.g. `ctx.getOrCreateValueState(store_name, codec)`), which delegate to these constructors. + +--- + +## 4. Keyed State — Factories and keyGroup / key / namespace + +**Keyed state is for keyed operators.** When the stream is partitioned by a key (e.g. after keyBy), each key gets isolated state. 
You obtain a **factory** once (from context, store name, and **key_group**), then create state **per primary key** (the stream key for the current record), passing the **namespace** at that point. + +### 4.1 keyGroup, key (primaryKey), and namespace + +| Term | API parameter | Meaning | +|---------------|-----------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------| +| **key_group** | `key_group` when creating the factory | The **keyed group**: identifies which keyed partition/group this state belongs to (e.g. one group for "counters", another for "sessions"). | +| **key** | `primary_key` argument to factory methods (e.g. `new_keyed_value(primary_key, namespace)`) | The **value of the stream key** for the current record (e.g. user ID, partition key). Each distinct key value gets isolated state. | +| **namespace** | `namespace` (bytes) argument to factory methods (e.g. `new_keyed_value(primary_key, namespace)`) | **With window functions**, use the **window identifier as bytes** so state is scoped per key and per window. **Without windows**, pass **empty bytes** (e.g. `b""`). 
| + +### 4.2 Factory constructor summary (keyed) + +| Factory | With codec | AutoCodec | +|--------------------------------|-----------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------| +| KeyedValueStateFactory | `KeyedValueStateFactory.from_context(ctx, store_name, key_group, value_codec)` | `KeyedValueStateFactory.from_context_auto_codec(ctx, store_name, key_group, value_type=None)` | +| KeyedListStateFactory | `KeyedListStateFactory.from_context(ctx, store_name, key_group, value_codec)` | `KeyedListStateFactory.from_context_auto_codec(ctx, store_name, key_group, value_type=None)` | +| KeyedMapStateFactory | `KeyedMapStateFactory.from_context(ctx, store_name, key_group, map_key_codec, map_value_codec)` | `KeyedMapStateFactory.from_context_auto_codec(ctx, store_name, key_group, map_key_type=None, map_value_type=None)` | +| KeyedPriorityQueueStateFactory | `KeyedPriorityQueueStateFactory.from_context(ctx, store_name, key_group, item_codec)` | `KeyedPriorityQueueStateFactory.from_context_auto_codec(ctx, store_name, key_group, item_type=None)` | +| KeyedAggregatingStateFactory | `KeyedAggregatingStateFactory.from_context(ctx, store_name, key_group, acc_codec, agg_func)` | `KeyedAggregatingStateFactory.from_context_auto_codec(ctx, store_name, key_group, agg_func, acc_type=None)` | +| KeyedReducingStateFactory | `KeyedReducingStateFactory.from_context(ctx, store_name, key_group, value_codec, reduce_func)` | `KeyedReducingStateFactory.from_context_auto_codec(ctx, store_name, key_group, reduce_func, value_type=None)` | + +You can also use the corresponding `ctx.getOrCreateKeyed*Factory(...)` methods, which delegate to these constructors. + +### 4.3 KeyedValueState + +KeyedValueState aligns with the Go SDK: the factory takes only `key_group` (no namespace). 
Factory: `KeyedValueStateFactory.from_context(ctx, store_name, key_group, value_codec)` or `from_context_auto_codec(ctx, store_name, key_group, value_type=None)`. Create state: `factory.new_keyed_value(primary_key, namespace)` (namespace is bytes, required). State methods: `update(value)`, `value()` (returns `(value, found)`), `clear()`. + +### 4.4 KeyedListState + +KeyedListState aligns with the Go SDK: the factory takes only `key_group` (no namespace); **key** and **namespace** are passed when creating the list. Factory: `KeyedListStateFactory.from_context(ctx, store_name, key_group, value_codec)` or `from_context_auto_codec(ctx, store_name, key_group, value_type=None)`. Create list: `factory.new_keyed_list(key, namespace)`, yielding `KeyedListState[V]`. State methods: `add(value)`, `add_all(values)`, `get()` (returns `List[V]`), `update(values)` (clears then writes the full list), `clear()`. + +### 4.5 KeyedAggregatingState + +KeyedAggregatingState aligns with the Go SDK: the factory takes only `key_group` (no namespace). Factory: `KeyedAggregatingStateFactory.from_context(ctx, store_name, key_group, acc_codec, agg_func)` or `from_context_auto_codec(ctx, store_name, key_group, agg_func, acc_type=None)`. Create state: `factory.new_aggregating_state(primary_key, state_name="")`, yielding `KeyedAggregatingState[T, ACC, R]` bound to that (primary_key, namespace=state_name). State methods: `add(value)` (merge into this state’s accumulator), `get()` (returns `(result, found)`), `clear()`. + +### 4.6 KeyedMapState + +KeyedMapState aligns with the Go SDK: the factory takes only `key_group` (no namespace), and the map key codec must be ordered. Factory: `KeyedMapStateFactory.from_context(ctx, store_name, key_group, map_key_codec, map_value_codec)` or `from_context_auto_codec(ctx, store_name, key_group, map_key_type=None, map_value_type=None)`. 
Create map: `factory.new_keyed_map(primary_key, map_name)` (map_name required, used as namespace), yielding `KeyedMapState[MK, MV]`. State methods: `put(map_key, value)`, `get(map_key)` (returns `(value, found)`), `delete(map_key)`, `clear()` (delete all entries in this map by prefix), `all()` (iterate over `(map_key, value)` pairs). + +### 4.7 KeyedPriorityQueueState + +KeyedPriorityQueueState aligns with the Go SDK: the factory takes only `key_group` (no namespace), and the element codec must be ordered. Factory: `KeyedPriorityQueueStateFactory.from_context(ctx, store_name, key_group, item_codec)` or `from_context_auto_codec(ctx, store_name, key_group, item_type=None)`. Create queue: `factory.new_keyed_priority_queue(primary_key, namespace)` (both required, bytes), yielding `KeyedPriorityQueueState[V]`. State methods: `add(value)`, `peek()` (returns `(min_element, found)`), `poll()` (remove and return min), `clear()` (delete all by prefix), `all()` (iterate over all elements in order). + +### 4.8 KeyedReducingState + +KeyedReducingState aligns with the Go SDK: the factory takes only `key_group` (no namespace). Factory: `KeyedReducingStateFactory.from_context(ctx, store_name, key_group, value_codec, reduce_func)` or `from_context_auto_codec(ctx, store_name, key_group, reduce_func, value_type=None)`. Create state: `factory.new_reducing_state(primary_key, namespace)` (both required, bytes), yielding `KeyedReducingState[V]`. State methods: `add(value)` (merge with current value via reduce_func and put), `get()` (returns `(value, found)`), `clear()`. + +--- + +## 5. Examples + +### 5.1 ValueState (from_context_auto_codec) + +Import ValueState from **fs_api_advanced** (Codec, ListState, MapState, etc. 
are in the same package): + +```python +from fs_api import FSProcessorDriver, Context +from fs_api_advanced import ValueState + +class CounterProcessor(FSProcessorDriver): + def process(self, ctx: Context, source_id: int, data: bytes): + state = ValueState.from_context_auto_codec(ctx, "my-store") + cur = state.value() + if cur is None: + cur = 0 + state.update(cur + 1) + ctx.emit(str(cur + 1).encode(), 0) +``` + +### 5.2 KeyedValueState (keyed operator) + +When the stream is partitioned by key, create the factory in `init` and obtain state per record’s `primary_key` in `process`, then use `update(value)` / `value()` / `clear()`: + +```python +from fs_api import FSProcessorDriver, Context +from fs_api_advanced import KeyedValueStateFactory + +class KeyedCounterProcessor(FSProcessorDriver): + def init(self, ctx: Context, config: dict): + self._factory = KeyedValueStateFactory.from_context_auto_codec( + ctx, "counters", b"by_key", value_type=int + ) + + def process(self, ctx: Context, source_id: int, data: bytes): + primary_key = data[:8] + state = self._factory.new_keyed_value(primary_key, b"count") + cur, found = state.value() + if not found: + cur = 0 + state.update(cur + 1) + ctx.emit(str(cur + 1).encode(), 0) +``` + +Same pattern for other state types: use `XxxState.from_context(ctx, store_name, ...)` or `XxxState.from_context_auto_codec(ctx, store_name)` as in the tables above. + +--- + +## 6. See also + +- [Python SDK Guide](python-sdk-guide.md) — main guide for fs_api, fs_client, and basic Context/KvStore usage. 
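
The requirement in sections 4.6 and 4.7 that map-key and priority-queue element codecs be ordered can be illustrated with a small standalone sketch. The class below is hypothetical and is not part of `fs_api_advanced`: the names `OrderedInt64Codec`, `encode`, `decode`, and `is_ordered_key_codec` are assumptions chosen to mirror the Go `IsOrderedKeyCodec()` method. It only shows what "ordered" means here: comparing the encoded bytes gives the same order as comparing the original values.

```python
import struct


class OrderedInt64Codec:
    """Hypothetical order-preserving codec for signed 64-bit integers.

    Illustration only; this is not the SDK's Codec base class.
    """

    _OFFSET = 1 << 63  # shifts the int64 range onto the uint64 range

    def encode(self, value: int) -> bytes:
        if not -(1 << 63) <= value < (1 << 63):
            raise ValueError(f"value out of int64 range: {value}")
        # Adding 2**63 maps the smallest int64 to 0 and the largest to 2**64 - 1.
        # Fixed-width big-endian bytes then sort in the same order as the values.
        return struct.pack(">Q", value + self._OFFSET)

    def decode(self, data: bytes) -> int:
        if len(data) != 8:
            raise ValueError(f"invalid int64 payload length: {len(data)}")
        (shifted,) = struct.unpack(">Q", data)
        return shifted - self._OFFSET

    def is_ordered_key_codec(self) -> bool:
        # Assumed naming, mirroring IsOrderedKeyCodec() in the Go codec package.
        return True


codec = OrderedInt64Codec()
values = [7, -3, 0, 2**40, -(2**63), 2**63 - 1]

# Sort by encoded bytes only, then decode: the result follows numeric order.
encoded_sorted = sorted(codec.encode(v) for v in values)
decoded = [codec.decode(b) for b in encoded_sorted]
```

A general-purpose serializer such as `pickle` gives no such byte-order guarantee, which is presumably why ordered codecs are called out separately for map keys and queue elements.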
diff --git a/docs/python-sdk-guide-zh.md b/docs/Python-SDK/python-sdk-guide-zh.md similarity index 86% rename from docs/python-sdk-guide-zh.md rename to docs/Python-SDK/python-sdk-guide-zh.md index 26fe31fc..966c1a0e 100644 --- a/docs/python-sdk-guide-zh.md +++ b/docs/Python-SDK/python-sdk-guide-zh.md @@ -27,11 +27,11 @@ Function Stream 为 Python 开发者提供了一套完整的工具链,涵盖 ## 一、SDK 核心组件定义 -| 包名 | 定位 | 核心功能 | -|------|------|----------| -| fs_api | 算子开发接口 | 定义 Processor 逻辑,提供 KV 状态存取及数据发射 (Emit) 能力。 | -| fs_client | 集群控制客户端 | 基于 gRPC,实现函数的远程注册、状态控制及拓扑配置。 | -| fs_runtime | 内建运行时(内部使用) | 封装了在 WASM 隔离环境中的 Python 解释器行为。 | +| 包名 | 定位 | 核心功能 | +|------------|-------------|--------------------------------------------| +| fs_api | 算子开发接口 | 定义 Processor 逻辑,提供 KV 状态存取及数据发射 (Emit) 能力。 | +| fs_client | 集群控制客户端 | 基于 gRPC,实现函数的远程注册、状态控制及拓扑配置。 | +| fs_runtime | 内建运行时(内部使用) | 封装了在 WASM 隔离环境中的 Python 解释器行为。 | --- @@ -63,6 +63,8 @@ Function Stream 的强大之处在于其内建的本地状态管理。 - 支持基础的 `put_state` / `get_state`。 - 进阶支持 ComplexKey(复杂键)操作,适用于多维索引或前缀扫描场景。 +**低阶与高阶库:** 上述 Context、KvStore 属于低阶 **functionstream-api**。若需带类型的 Codec、ValueState、ListState、MapState、Keyed\* 等高级状态 API,请使用独立库 **functionstream-api-advanced**(依赖 functionstream-api),并从 `fs_api_advanced` 导入,详见 [Python SDK — 高级状态 API](python-sdk-advanced-state-api-zh.md)。 + ### 2.3 生产级代码示例 ```python diff --git a/docs/python-sdk-guide.md b/docs/Python-SDK/python-sdk-guide.md similarity index 95% rename from docs/python-sdk-guide.md rename to docs/Python-SDK/python-sdk-guide.md index 29bb0d37..98ce7d47 100644 --- a/docs/python-sdk-guide.md +++ b/docs/Python-SDK/python-sdk-guide.md @@ -63,6 +63,8 @@ The power of Function Stream lies in its built-in local state management. - Supports basic `put_state` / `get_state`. - Advanced support for ComplexKey operations, suitable for multi-dimensional indexing or prefix scanning scenarios. +**Low-level vs advanced libraries:** The Context and KvStore above belong to the low-level **functionstream-api**. 
For typed Codec, ValueState, ListState, MapState, Keyed\* state, etc., use the separate library **functionstream-api-advanced** (depends on functionstream-api) and import from `fs_api_advanced`. See [Python SDK — Advanced State API](python-sdk-advanced-state-api.md). + ### 2.3 Production-Grade Code Example ```python @@ -148,3 +150,4 @@ with FsClient(host="10.0.0.1", port=8080) as client: | BadRequestError (400) | YAML configuration does not meet specifications or Kafka parameters are incorrect | Check configuration items in WasmTaskBuilder. | | ServerError (500) | Server-side runtime environment (e.g., RocksDB) exception | Check permissions of storage path in server conf/config.yaml. | | NotFoundError (404) | Operating on a non-existent function or invalid Checkpoint | Confirm if the function name is correct. | + diff --git a/go-sdk-advanced/codec/bool_codec.go b/go-sdk-advanced/codec/bool_codec.go new file mode 100644 index 00000000..d18fbe63 --- /dev/null +++ b/go-sdk-advanced/codec/bool_codec.go @@ -0,0 +1,42 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package codec + +import "fmt" + +type BoolCodec struct{} + +func (c BoolCodec) Encode(value bool) ([]byte, error) { + if value { + return []byte{1}, nil + } + return []byte{0}, nil +} + +func (c BoolCodec) Decode(data []byte) (bool, error) { + if len(data) != 1 { + return false, fmt.Errorf("invalid bool payload length: %d", len(data)) + } + switch data[0] { + case 0: + return false, nil + case 1: + return true, nil + default: + return false, fmt.Errorf("invalid bool payload byte: %d", data[0]) + } +} + +func (c BoolCodec) EncodedSize() int { return 1 } + +func (c BoolCodec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/default_codec.go b/go-sdk-advanced/codec/default_codec.go new file mode 100644 index 00000000..984a050a --- /dev/null +++ b/go-sdk-advanced/codec/default_codec.go @@ -0,0 +1,61 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package codec + +import ( + "fmt" + "reflect" +) + +func DefaultCodecFor[V any]() (Codec[V], error) { + var zero V + t := reflect.TypeOf(zero) + if t == nil { + return nil, fmt.Errorf("default codec: type parameter V must not be interface without type constraint") + } + k := t.Kind() + switch k { + case reflect.Bool: + return any(BoolCodec{}).(Codec[V]), nil + case reflect.Int32: + return any(Int32Codec{}).(Codec[V]), nil + case reflect.Int64: + return any(Int64Codec{}).(Codec[V]), nil + case reflect.Uint32: + return any(Uint32Codec{}).(Codec[V]), nil + case reflect.Uint64: + return any(Uint64Codec{}).(Codec[V]), nil + case reflect.Float32: + return any(Float32Codec{}).(Codec[V]), nil + case reflect.Float64: + return any(Float64Codec{}).(Codec[V]), nil + case reflect.String: + return any(StringCodec{}).(Codec[V]), nil + case reflect.Int: + return any(IntCodec{}).(Codec[V]), nil + case reflect.Int8: + return any(Int8Codec{}).(Codec[V]), nil + case reflect.Int16: + return any(Int16Codec{}).(Codec[V]), nil + case reflect.Uint: + return any(UintCodec{}).(Codec[V]), nil + case reflect.Uint8: + return any(Uint8Codec{}).(Codec[V]), nil + case reflect.Uint16: + return any(Uint16Codec{}).(Codec[V]), nil + case reflect.Struct, reflect.Map, reflect.Slice, reflect.Array, reflect.Interface: + return any(JSONCodec[V]{}).(Codec[V]), nil + default: + return any(JSONCodec[V]{}).(Codec[V]), nil + } +} diff --git a/go-sdk-advanced/codec/float32_codec.go b/go-sdk-advanced/codec/float32_codec.go new file mode 100644 index 00000000..139f76ec --- /dev/null +++ b/go-sdk-advanced/codec/float32_codec.go @@ -0,0 +1,50 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package codec + +import ( + "encoding/binary" + "fmt" + "math" +) + +type Float32Codec struct{} + +func (c Float32Codec) Encode(value float32) ([]byte, error) { + bits := math.Float32bits(value) + if (bits & (uint32(1) << 31)) != 0 { + bits = ^bits + } else { + bits ^= uint32(1) << 31 + } + out := make([]byte, 4) + binary.BigEndian.PutUint32(out, bits) + return out, nil +} + +func (c Float32Codec) Decode(data []byte) (float32, error) { + if len(data) != 4 { + return 0, fmt.Errorf("invalid float32 payload length: %d", len(data)) + } + encoded := binary.BigEndian.Uint32(data) + if (encoded & (uint32(1) << 31)) != 0 { + encoded ^= uint32(1) << 31 + } else { + encoded = ^encoded + } + return math.Float32frombits(encoded), nil +} + +func (c Float32Codec) EncodedSize() int { return 4 } + +func (c Float32Codec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/float64_codec.go b/go-sdk-advanced/codec/float64_codec.go new file mode 100644 index 00000000..895fb810 --- /dev/null +++ b/go-sdk-advanced/codec/float64_codec.go @@ -0,0 +1,50 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+// See the License for the specific language governing permissions and +// limitations under the License. + +package codec + +import ( + "encoding/binary" + "fmt" + "math" +) + +type Float64Codec struct{} + +func (c Float64Codec) Encode(value float64) ([]byte, error) { + bits := math.Float64bits(value) + if (bits & (uint64(1) << 63)) != 0 { + bits = ^bits + } else { + bits ^= uint64(1) << 63 + } + out := make([]byte, 8) + binary.BigEndian.PutUint64(out, bits) + return out, nil +} + +func (c Float64Codec) Decode(data []byte) (float64, error) { + if len(data) != 8 { + return 0, fmt.Errorf("invalid float64 payload length: %d", len(data)) + } + encoded := binary.BigEndian.Uint64(data) + if (encoded & (uint64(1) << 63)) != 0 { + encoded ^= uint64(1) << 63 + } else { + encoded = ^encoded + } + return math.Float64frombits(encoded), nil +} + +func (c Float64Codec) EncodedSize() int { return 8 } + +func (c Float64Codec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/int16_codec.go b/go-sdk-advanced/codec/int16_codec.go new file mode 100644 index 00000000..f678f792 --- /dev/null +++ b/go-sdk-advanced/codec/int16_codec.go @@ -0,0 +1,37 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package codec + +import ( + "encoding/binary" + "fmt" +) + +type Int16Codec struct{} + +func (c Int16Codec) Encode(value int16) ([]byte, error) { + out := make([]byte, 2) + binary.BigEndian.PutUint16(out, uint16(value)^(uint16(1)<<15)) + return out, nil +} + +func (c Int16Codec) Decode(data []byte) (int16, error) { + if len(data) != 2 { + return 0, fmt.Errorf("invalid int16 payload length: %d", len(data)) + } + return int16(binary.BigEndian.Uint16(data) ^ (uint16(1) << 15)), nil +} + +func (c Int16Codec) EncodedSize() int { return 2 } + +func (c Int16Codec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/int32_codec.go b/go-sdk-advanced/codec/int32_codec.go new file mode 100644 index 00000000..03792712 --- /dev/null +++ b/go-sdk-advanced/codec/int32_codec.go @@ -0,0 +1,37 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package codec + +import ( + "encoding/binary" + "fmt" +) + +type Int32Codec struct{} + +func (c Int32Codec) Encode(value int32) ([]byte, error) { + out := make([]byte, 4) + binary.BigEndian.PutUint32(out, uint32(value)^(uint32(1)<<31)) + return out, nil +} + +func (c Int32Codec) Decode(data []byte) (int32, error) { + if len(data) != 4 { + return 0, fmt.Errorf("invalid int32 payload length: %d", len(data)) + } + return int32(binary.BigEndian.Uint32(data) ^ (uint32(1) << 31)), nil +} + +func (c Int32Codec) EncodedSize() int { return 4 } + +func (c Int32Codec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/int64_codec.go b/go-sdk-advanced/codec/int64_codec.go new file mode 100644 index 00000000..cb632bda --- /dev/null +++ b/go-sdk-advanced/codec/int64_codec.go @@ -0,0 +1,40 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package codec + +import ( + "encoding/binary" + "fmt" +) + +type Int64Codec struct{} + +func (c Int64Codec) Encode(value int64) ([]byte, error) { + out := make([]byte, 8) + mapped := uint64(value) ^ (uint64(1) << 63) + binary.BigEndian.PutUint64(out, mapped) + return out, nil +} + +func (c Int64Codec) Decode(data []byte) (int64, error) { + if len(data) != 8 { + return 0, fmt.Errorf("invalid int64 payload length: %d", len(data)) + } + mapped := binary.BigEndian.Uint64(data) + raw := int64(mapped ^ (uint64(1) << 63)) + return raw, nil +} + +func (c Int64Codec) EncodedSize() int { return 8 } + +func (c Int64Codec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/int8_codec.go b/go-sdk-advanced/codec/int8_codec.go new file mode 100644 index 00000000..fac2dd81 --- /dev/null +++ b/go-sdk-advanced/codec/int8_codec.go @@ -0,0 +1,34 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package codec + +import ( + "fmt" +) + +type Int8Codec struct{} + +func (c Int8Codec) Encode(value int8) ([]byte, error) { + return []byte{byte(uint8(value) ^ (1 << 7))}, nil +} + +func (c Int8Codec) Decode(data []byte) (int8, error) { + if len(data) != 1 { + return 0, fmt.Errorf("invalid int8 payload length: %d", len(data)) + } + return int8(data[0] ^ (1 << 7)), nil +} + +func (c Int8Codec) EncodedSize() int { return 1 } + +func (c Int8Codec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/int_codec.go b/go-sdk-advanced/codec/int_codec.go new file mode 100644 index 00000000..402ecbbc --- /dev/null +++ b/go-sdk-advanced/codec/int_codec.go @@ -0,0 +1,33 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package codec + +// IntCodec implements Codec[int] by delegating to Int64Codec. +// It is used for reflect.Int (platform-sized int) so that DefaultCodecFor[int]() +// returns a valid Codec[int] instead of panicking on type assertion. 
+type IntCodec struct{} + +var _ Codec[int] = IntCodec{} + +func (c IntCodec) Encode(value int) ([]byte, error) { + return Int64Codec{}.Encode(int64(value)) +} + +func (c IntCodec) Decode(data []byte) (int, error) { + v, err := Int64Codec{}.Decode(data) + return int(v), err +} + +func (c IntCodec) EncodedSize() int { return 8 } + +func (c IntCodec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/interface.go b/go-sdk-advanced/codec/interface.go new file mode 100644 index 00000000..2e35b4f7 --- /dev/null +++ b/go-sdk-advanced/codec/interface.go @@ -0,0 +1,28 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package codec + +// Codec is the core encoding interface. EncodedSize reports encoded byte length: +// >0 for fixed-size, <=0 for variable-size. +// IsOrderedKeyCodec reports whether the encoding is byte-orderable (for use as map/keyed state key). 
+type Codec[T any] interface { + Encode(value T) ([]byte, error) + Decode(data []byte) (T, error) + EncodedSize() int + IsOrderedKeyCodec() bool +} + +func FixedEncodedSize[T any](c Codec[T]) (int, bool) { + n := c.EncodedSize() + return n, n > 0 +} diff --git a/go-sdk-advanced/codec/json_codec.go b/go-sdk-advanced/codec/json_codec.go new file mode 100644 index 00000000..abf65b8d --- /dev/null +++ b/go-sdk-advanced/codec/json_codec.go @@ -0,0 +1,30 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package codec + +import "encoding/json" + +type JSONCodec[T any] struct{} + +func (c JSONCodec[T]) Encode(value T) ([]byte, error) { return json.Marshal(value) } + +func (c JSONCodec[T]) Decode(data []byte) (T, error) { + var out T + if err := json.Unmarshal(data, &out); err != nil { + return out, err + } + return out, nil +} + +func (c JSONCodec[T]) EncodedSize() int { return -1 } +func (c JSONCodec[T]) IsOrderedKeyCodec() bool { return false } diff --git a/go-sdk-advanced/codec/string_codec.go b/go-sdk-advanced/codec/string_codec.go new file mode 100644 index 00000000..0f4f2e21 --- /dev/null +++ b/go-sdk-advanced/codec/string_codec.go @@ -0,0 +1,23 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package codec + +type StringCodec struct{} + +func (c StringCodec) Encode(value string) ([]byte, error) { return []byte(value), nil } + +func (c StringCodec) Decode(data []byte) (string, error) { return string(data), nil } + +func (c StringCodec) EncodedSize() int { return -1 } + +func (c StringCodec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/uint16_codec.go b/go-sdk-advanced/codec/uint16_codec.go new file mode 100644 index 00000000..524553a5 --- /dev/null +++ b/go-sdk-advanced/codec/uint16_codec.go @@ -0,0 +1,37 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package codec + +import ( + "encoding/binary" + "fmt" +) + +type Uint16Codec struct{} + +func (c Uint16Codec) Encode(value uint16) ([]byte, error) { + out := make([]byte, 2) + binary.BigEndian.PutUint16(out, value) + return out, nil +} + +func (c Uint16Codec) Decode(data []byte) (uint16, error) { + if len(data) != 2 { + return 0, fmt.Errorf("invalid uint16 payload length: %d", len(data)) + } + return binary.BigEndian.Uint16(data), nil +} + +func (c Uint16Codec) EncodedSize() int { return 2 } + +func (c Uint16Codec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/uint32_codec.go b/go-sdk-advanced/codec/uint32_codec.go new file mode 100644 index 00000000..a9555278 --- /dev/null +++ b/go-sdk-advanced/codec/uint32_codec.go @@ -0,0 +1,37 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package codec + +import ( + "encoding/binary" + "fmt" +) + +type Uint32Codec struct{} + +func (c Uint32Codec) Encode(value uint32) ([]byte, error) { + out := make([]byte, 4) + binary.BigEndian.PutUint32(out, value) + return out, nil +} + +func (c Uint32Codec) Decode(data []byte) (uint32, error) { + if len(data) != 4 { + return 0, fmt.Errorf("invalid uint32 payload length: %d", len(data)) + } + return binary.BigEndian.Uint32(data), nil +} + +func (c Uint32Codec) EncodedSize() int { return 4 } + +func (c Uint32Codec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/uint64_codec.go b/go-sdk-advanced/codec/uint64_codec.go new file mode 100644 index 00000000..485df695 --- /dev/null +++ b/go-sdk-advanced/codec/uint64_codec.go @@ -0,0 +1,37 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package codec + +import ( + "encoding/binary" + "fmt" +) + +type Uint64Codec struct{} + +func (c Uint64Codec) Encode(value uint64) ([]byte, error) { + out := make([]byte, 8) + binary.BigEndian.PutUint64(out, value) + return out, nil +} + +func (c Uint64Codec) Decode(data []byte) (uint64, error) { + if len(data) != 8 { + return 0, fmt.Errorf("invalid uint64 payload length: %d", len(data)) + } + return binary.BigEndian.Uint64(data), nil +} + +func (c Uint64Codec) EncodedSize() int { return 8 } + +func (c Uint64Codec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/uint8_codec.go b/go-sdk-advanced/codec/uint8_codec.go new file mode 100644 index 00000000..45e1ce36 --- /dev/null +++ b/go-sdk-advanced/codec/uint8_codec.go @@ -0,0 +1,34 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package codec + +import ( + "fmt" +) + +type Uint8Codec struct{} + +func (c Uint8Codec) Encode(value uint8) ([]byte, error) { + return []byte{byte(value)}, nil +} + +func (c Uint8Codec) Decode(data []byte) (uint8, error) { + if len(data) != 1 { + return 0, fmt.Errorf("invalid uint8 payload length: %d", len(data)) + } + return data[0], nil +} + +func (c Uint8Codec) EncodedSize() int { return 1 } + +func (c Uint8Codec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/codec/uint_codec.go b/go-sdk-advanced/codec/uint_codec.go new file mode 100644 index 00000000..b2d618df --- /dev/null +++ b/go-sdk-advanced/codec/uint_codec.go @@ -0,0 +1,33 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package codec + +// UintCodec implements Codec[uint] by delegating to Uint64Codec. +// It is used for reflect.Uint (platform-sized uint) so that DefaultCodecFor[uint]() +// returns a valid Codec[uint] instead of panicking on type assertion. 
+type UintCodec struct{} + +var _ Codec[uint] = UintCodec{} + +func (c UintCodec) Encode(value uint) ([]byte, error) { + return Uint64Codec{}.Encode(uint64(value)) +} + +func (c UintCodec) Decode(data []byte) (uint, error) { + v, err := Uint64Codec{}.Decode(data) + return uint(v), err +} + +func (c UintCodec) EncodedSize() int { return 8 } + +func (c UintCodec) IsOrderedKeyCodec() bool { return true } diff --git a/go-sdk-advanced/go.mod b/go-sdk-advanced/go.mod new file mode 100644 index 00000000..550f4782 --- /dev/null +++ b/go-sdk-advanced/go.mod @@ -0,0 +1,19 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +module github.com/functionstream/function-stream/go-sdk-advanced + +go 1.23.0 + +require github.com/functionstream/function-stream/go-sdk v0.0.0 + +replace github.com/functionstream/function-stream/go-sdk => ../go-sdk diff --git a/go-sdk-advanced/keyed/keyed_aggregating_state.go b/go-sdk-advanced/keyed/keyed_aggregating_state.go new file mode 100644 index 00000000..12912caf --- /dev/null +++ b/go-sdk-advanced/keyed/keyed_aggregating_state.go @@ -0,0 +1,156 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package keyed + +import ( + "fmt" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +type AggregateFunc[T any, ACC any, R any] interface { + CreateAccumulator() ACC + Add(value T, accumulator ACC) ACC + GetResult(accumulator ACC) R + Merge(a ACC, b ACC) ACC +} + +type KeyedAggregatingStateFactory[T any, ACC any, R any] struct { + store common.Store + groupKey []byte + accCodec codec.Codec[ACC] + aggFunc AggregateFunc[T, ACC, R] +} + +// NewKeyedAggregatingStateFactoryFromContext creates a KeyedAggregatingStateFactory using the store from ctx.GetOrCreateStore(storeName). +func NewKeyedAggregatingStateFactoryFromContext[T any, ACC any, R any](ctx api.Context, storeName string, keyGroup []byte, accCodec codec.Codec[ACC], aggFunc AggregateFunc[T, ACC, R]) (*KeyedAggregatingStateFactory[T, ACC, R], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newKeyedAggregatingStateFactory(store, keyGroup, accCodec, aggFunc) +} + +// NewKeyedAggregatingStateFactoryFromContextAutoCodec creates a KeyedAggregatingStateFactory with default accumulator codec from ctx.GetOrCreateStore(storeName). 
+func NewKeyedAggregatingStateFactoryFromContextAutoCodec[T any, ACC any, R any](ctx api.Context, storeName string, keyGroup []byte, aggFunc AggregateFunc[T, ACC, R]) (*KeyedAggregatingStateFactory[T, ACC, R], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + accCodec, err := codec.DefaultCodecFor[ACC]() + if err != nil { + return nil, err + } + return newKeyedAggregatingStateFactory(store, keyGroup, accCodec, aggFunc) +} + +func newKeyedAggregatingStateFactory[T any, ACC any, R any]( + store common.Store, + keyGroup []byte, + accCodec codec.Codec[ACC], + aggFunc AggregateFunc[T, ACC, R], +) (*KeyedAggregatingStateFactory[T, ACC, R], error) { + + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed aggregating state factory store must not be nil") + } + if keyGroup == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed aggregating state factory key_group must not be nil") + } + if accCodec == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed aggregating state factory acc_codec must not be nil") + } + if aggFunc == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed aggregating state factory agg_func must not be nil") + } + + return &KeyedAggregatingStateFactory[T, ACC, R]{ + store: store, + groupKey: common.DupBytes(keyGroup), + accCodec: accCodec, + aggFunc: aggFunc, + }, nil +} + +func (f *KeyedAggregatingStateFactory[T, ACC, R]) NewAggregatingState(primaryKey []byte, stateName string) (*KeyedAggregatingState[T, ACC, R], error) { + return &KeyedAggregatingState[T, ACC, R]{ + factory: f, + primaryKey: common.DupBytes(primaryKey), + namespace: []byte(stateName), + }, nil +} + +type KeyedAggregatingState[T any, ACC any, R any] struct { + factory *KeyedAggregatingStateFactory[T, ACC, R] + primaryKey []byte + namespace []byte +} + +func (s *KeyedAggregatingState[T, ACC, R]) buildCK() api.ComplexKey { + return api.ComplexKey{ + KeyGroup: s.factory.groupKey, + Key: 
s.primaryKey, + Namespace: s.namespace, + UserKey: []byte{}, + } +} + +func (s *KeyedAggregatingState[T, ACC, R]) Add(value T) error { + ck := s.buildCK() + + raw, found, err := s.factory.store.Get(ck) + if err != nil { + return fmt.Errorf("failed to get accumulator: %w", err) + } + + var acc ACC + if !found { + acc = s.factory.aggFunc.CreateAccumulator() + } else { + var err error + acc, err = s.factory.accCodec.Decode(raw) + if err != nil { + return fmt.Errorf("failed to decode accumulator: %w", err) + } + } + + newAcc := s.factory.aggFunc.Add(value, acc) + + encoded, err := s.factory.accCodec.Encode(newAcc) + if err != nil { + return fmt.Errorf("failed to encode new accumulator: %w", err) + } + return s.factory.store.Put(ck, encoded) +} + +func (s *KeyedAggregatingState[T, ACC, R]) Get() (R, bool, error) { + var zero R + ck := s.buildCK() + raw, found, err := s.factory.store.Get(ck) + if err != nil || !found { + return zero, found, err + } + + acc, err := s.factory.accCodec.Decode(raw) + if err != nil { + return zero, false, err + } + + return s.factory.aggFunc.GetResult(acc), true, nil +} + +func (s *KeyedAggregatingState[T, ACC, R]) Clear() error { + return s.factory.store.Delete(s.buildCK()) +} diff --git a/go-sdk-advanced/keyed/keyed_list_state.go b/go-sdk-advanced/keyed/keyed_list_state.go new file mode 100644 index 00000000..44a00a06 --- /dev/null +++ b/go-sdk-advanced/keyed/keyed_list_state.go @@ -0,0 +1,286 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+// See the License for the specific language governing permissions and +// limitations under the License. + +package keyed + +import ( + "encoding/binary" + "fmt" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +type KeyedListStateFactory[V any] struct { + store common.Store + keyGroup []byte + fixedSize int + valueCodec codec.Codec[V] + isFixed bool +} + +// NewKeyedListStateFactoryFromContext creates a KeyedListStateFactory using the store from ctx.GetOrCreateStore(storeName). +func NewKeyedListStateFactoryFromContext[V any](ctx api.Context, storeName string, keyGroup []byte, valueCodec codec.Codec[V]) (*KeyedListStateFactory[V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newKeyedListStateFactory(store, keyGroup, valueCodec) +} + +// NewKeyedListStateFactoryAutoCodecFromContext creates a KeyedListStateFactory with default value codec using the store from context. 
+func NewKeyedListStateFactoryAutoCodecFromContext[V any](ctx api.Context, storeName string, keyGroup []byte) (*KeyedListStateFactory[V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newKeyedListStateFactoryAutoCodec[V](store, keyGroup) +} + +func newKeyedListStateFactory[V any](store common.Store, keyGroup []byte, valueCodec codec.Codec[V]) (*KeyedListStateFactory[V], error) { + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed list state factory store must not be nil") + } + if keyGroup == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed list state factory key group must not be nil") + } + if valueCodec == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed list value codec must not be nil") + } + fixedSize, isFixed := codec.FixedEncodedSize[V](valueCodec) + return &KeyedListStateFactory[V]{ + store: store, + keyGroup: common.DupBytes(keyGroup), + fixedSize: fixedSize, + valueCodec: valueCodec, + isFixed: isFixed, + }, nil +} + +func newKeyedListStateFactoryAutoCodec[V any](store common.Store, keyGroup []byte) (*KeyedListStateFactory[V], error) { + valueCodec, err := codec.DefaultCodecFor[V]() + if err != nil { + return nil, err + } + return newKeyedListStateFactory[V](store, keyGroup, valueCodec) +} + +type KeyedListState[V any] struct { + factory *KeyedListStateFactory[V] + complexKey api.ComplexKey + fixedSize int + valueCodec codec.Codec[V] + serialize func(V) ([]byte, error) + serializeBatch func([]V) ([]byte, error) + decode func([]byte) ([]V, error) +} + +func newKeyedListFromFactory[V any](f *KeyedListStateFactory[V], key []byte, namespace []byte) (*KeyedListState[V], error) { + if f == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed list factory must not be nil") + } + if key == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed list key must not be nil") + } + if namespace == nil { + return nil, 
api.NewError(api.ErrStoreInternal, "keyed list namespace must not be nil") + } + s := &KeyedListState[V]{ + factory: f, + valueCodec: f.valueCodec, + complexKey: api.ComplexKey{ + KeyGroup: f.keyGroup, + Key: key, + Namespace: namespace, + UserKey: []byte{}, + }, + fixedSize: f.fixedSize, + } + if f.isFixed { + s.serialize = s.serializeValueFixed + s.serializeBatch = s.serializeValuesFixedBatch + s.decode = s.deserializeValuesFixed + } else { + s.serialize = s.serializeValueVarLen + s.serializeBatch = s.serializeValuesVarLenBatch + s.decode = s.deserializeValuesVarLen + } + return s, nil +} + +func NewKeyedListFromFactory[V any](f *KeyedListStateFactory[V], key []byte, namespace []byte) (*KeyedListState[V], error) { + return newKeyedListFromFactory[V](f, key, namespace) +} + +func (s *KeyedListState[V]) Add(value V) error { + payload, err := s.serialize(value) + if err != nil { + return err + } + return s.factory.store.Merge(s.complexKey, payload) +} + +func (s *KeyedListState[V]) AddAll(values []V) error { + payload, err := s.serializeBatch(values) + if err != nil { + return err + } + if err := s.factory.store.Merge(s.complexKey, payload); err != nil { + return err + } + return nil +} + +func (s *KeyedListState[V]) Get() ([]V, error) { + raw, found, err := s.factory.store.Get(s.complexKey) + if err != nil { + return nil, err + } + if !found { + return []V{}, nil + } + return s.decode(raw) +} + +// Update replaces the list with the given values: it clears the existing entry, then writes the batch payload with a single Put. 
+func (s *KeyedListState[V]) Update(values []V) error { + if err := s.Clear(); err != nil { + return err + } + payload, err := s.serializeBatch(values) + if err != nil { + return err + } + return s.factory.store.Put(s.complexKey, payload) +} + +func (s *KeyedListState[V]) Clear() error { + return s.factory.store.Delete(s.complexKey) +} + +func (s *KeyedListState[V]) serializeValueVarLen(value V) ([]byte, error) { + encoded, err := s.valueCodec.Encode(value) + if err != nil { + return nil, fmt.Errorf("encode keyed list value failed: %w", err) + } + out := make([]byte, 4, 4+len(encoded)) + binary.BigEndian.PutUint32(out, uint32(len(encoded))) + out = append(out, encoded...) + return out, nil +} + +func (s *KeyedListState[V]) serializeValuesVarLenBatch(values []V) ([]byte, error) { + total := 0 + encodedValues := make([][]byte, 0, len(values)) + for _, value := range values { + encoded, err := s.valueCodec.Encode(value) + if err != nil { + return nil, fmt.Errorf("encode keyed list value failed: %w", err) + } + encodedValues = append(encodedValues, encoded) + total += 4 + len(encoded) + } + out := make([]byte, 0, total) + for _, encoded := range encodedValues { + var lenBuf [4]byte + binary.BigEndian.PutUint32(lenBuf[:], uint32(len(encoded))) + out = append(out, lenBuf[:]...) + out = append(out, encoded...) 
+ } + return out, nil +} + +func (s *KeyedListState[V]) deserializeValuesVarLen(raw []byte) ([]V, error) { + out := make([]V, 0, 16) + idx := 0 + for idx < len(raw) { + if len(raw)-idx < 4 { + return nil, api.NewError(api.ErrResultUnexpected, "corrupted keyed list payload: truncated length") + } + + itemLen := int(binary.BigEndian.Uint32(raw[idx : idx+4])) + idx += 4 + + if itemLen < 0 || len(raw)-idx < itemLen { + return nil, api.NewError(api.ErrResultUnexpected, "corrupted keyed list payload: invalid element length") + } + + itemRaw := raw[idx : idx+itemLen] + idx += itemLen + + value, err := s.valueCodec.Decode(itemRaw) + if err != nil { + return nil, fmt.Errorf("decode keyed list value failed: %w", err) + } + out = append(out, value) + } + return out, nil +} + +func (s *KeyedListState[V]) serializeValueFixed(value V) ([]byte, error) { + if s.fixedSize <= 0 { + return nil, api.NewError(api.ErrResultUnexpected, "fixed-size codec must report positive size") + } + encoded, err := s.valueCodec.Encode(value) + if err != nil { + return nil, fmt.Errorf("encode keyed list value failed: %w", err) + } + if len(encoded) != s.fixedSize { + return nil, api.NewError(api.ErrResultUnexpected, "fixed-size codec encoded unexpected length: got %d, want %d", len(encoded), s.fixedSize) + } + out := make([]byte, 0, s.fixedSize) + out = append(out, encoded...) 
+ return out, nil +} + +func (s *KeyedListState[V]) serializeValuesFixedBatch(values []V) ([]byte, error) { + if s.fixedSize <= 0 { + return nil, api.NewError(api.ErrResultUnexpected, "fixed-size codec must report positive size") + } + total := s.fixedSize * len(values) + out := make([]byte, 0, total) + for _, value := range values { + encoded, err := s.valueCodec.Encode(value) + if err != nil { + return nil, fmt.Errorf("encode keyed list value failed: %w", err) + } + if len(encoded) != s.fixedSize { + return nil, api.NewError(api.ErrResultUnexpected, "fixed-size codec encoded unexpected length: got %d, want %d", len(encoded), s.fixedSize) + } + out = append(out, encoded...) + } + return out, nil +} + +func (s *KeyedListState[V]) deserializeValuesFixed(raw []byte) ([]V, error) { + if s.fixedSize <= 0 { + return nil, api.NewError(api.ErrResultUnexpected, "fixed-size codec must report positive size") + } + + if len(raw)%s.fixedSize != 0 { + return nil, api.NewError(api.ErrResultUnexpected, "corrupted keyed list payload: fixed-size data length mismatch") + } + + count := len(raw) / s.fixedSize + out := make([]V, 0, count) + + for idx := 0; idx < len(raw); idx += s.fixedSize { + itemRaw := raw[idx : idx+s.fixedSize] + value, err := s.valueCodec.Decode(itemRaw) + if err != nil { + return nil, fmt.Errorf("decode keyed list value failed: %w", err) + } + out = append(out, value) + } + return out, nil +} diff --git a/go-sdk-advanced/keyed/keyed_map_state.go b/go-sdk-advanced/keyed/keyed_map_state.go new file mode 100644 index 00000000..539fe70f --- /dev/null +++ b/go-sdk-advanced/keyed/keyed_map_state.go @@ -0,0 +1,198 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package keyed + +import ( + "fmt" + "iter" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +type KeyedMapStateFactory[MK any, MV any] struct { + store common.Store + groupKey []byte + mapKeyCodec codec.Codec[MK] + mapValueCodec codec.Codec[MV] +} + +// NewKeyedMapStateFactoryFromContext creates a KeyedMapStateFactory using the store from ctx.GetOrCreateStore(storeName). +func NewKeyedMapStateFactoryFromContext[MK any, MV any](ctx api.Context, storeName string, keyGroup []byte, keyCodec codec.Codec[MK], valueCodec codec.Codec[MV]) (*KeyedMapStateFactory[MK, MV], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newKeyedMapStateFactory(store, keyGroup, keyCodec, valueCodec) +} + +// NewKeyedMapStateFactoryFromContextAutoCodec creates a KeyedMapStateFactory with default map-key and map-value codecs. MK must have an ordered default codec. 
+func NewKeyedMapStateFactoryFromContextAutoCodec[MK any, MV any](ctx api.Context, storeName string, keyGroup []byte) (*KeyedMapStateFactory[MK, MV], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + mapKeyCodec, err := codec.DefaultCodecFor[MK]() + if err != nil { + return nil, err + } + mapValueCodec, err := codec.DefaultCodecFor[MV]() + if err != nil { + return nil, err + } + return newKeyedMapStateFactory(store, keyGroup, mapKeyCodec, mapValueCodec) +} + +func newKeyedMapStateFactory[MK any, MV any]( + store common.Store, + keyGroup []byte, + mapKeyCodec codec.Codec[MK], + mapValueCodec codec.Codec[MV], +) (*KeyedMapStateFactory[MK, MV], error) { + + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed map state factory store must not be nil") + } + if keyGroup == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed map state factory key_group must not be nil") + } + if mapKeyCodec == nil || mapValueCodec == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed map state factory map_key_codec and map_value_codec must not be nil") + } + + if !mapKeyCodec.IsOrderedKeyCodec() { + return nil, api.NewError(api.ErrStoreInternal, "map key codec must be ordered") + } + + return &KeyedMapStateFactory[MK, MV]{ + store: store, + groupKey: common.DupBytes(keyGroup), + mapKeyCodec: mapKeyCodec, + mapValueCodec: mapValueCodec, + }, nil +} + +type KeyedMapState[MK any, MV any] struct { + factory *KeyedMapStateFactory[MK, MV] + primaryKey []byte + namespace []byte +} + +func (f *KeyedMapStateFactory[MK, MV]) NewKeyedMap(primaryKey []byte, mapName string) (*KeyedMapState[MK, MV], error) { + if primaryKey == nil || mapName == "" { + return nil, api.NewError(api.ErrStoreInternal, "primary key and map name are required") + } + return &KeyedMapState[MK, MV]{ + factory: f, + primaryKey: common.DupBytes(primaryKey), + namespace: []byte(mapName), + }, nil +} + +func (s *KeyedMapState[MK, MV]) 
buildCK(mapKey MK) (api.ComplexKey, error) { + encodedMapKey, err := s.factory.mapKeyCodec.Encode(mapKey) + if err != nil { + return api.ComplexKey{}, fmt.Errorf("encode map userKey failed: %w", err) + } + return api.ComplexKey{ + KeyGroup: s.factory.groupKey, + Key: s.primaryKey, + Namespace: s.namespace, + UserKey: encodedMapKey, + }, nil +} + +func (s *KeyedMapState[MK, MV]) Put(mapKey MK, value MV) error { + ck, err := s.buildCK(mapKey) + if err != nil { + return err + } + encodedValue, err := s.factory.mapValueCodec.Encode(value) + if err != nil { + return err + } + return s.factory.store.Put(ck, encodedValue) +} + +func (s *KeyedMapState[MK, MV]) Get(mapKey MK) (MV, bool, error) { + var zero MV + ck, err := s.buildCK(mapKey) + if err != nil { + return zero, false, err + } + raw, found, err := s.factory.store.Get(ck) + if err != nil || !found { + return zero, found, err + } + decoded, err := s.factory.mapValueCodec.Decode(raw) + if err != nil { + return zero, false, err + } + return decoded, true, nil +} + +func (s *KeyedMapState[MK, MV]) Delete(mapKey MK) error { + ck, err := s.buildCK(mapKey) + if err != nil { + return err + } + return s.factory.store.Delete(ck) +} + +func (s *KeyedMapState[MK, MV]) Clear() error { + return s.factory.store.DeletePrefix(api.ComplexKey{ + KeyGroup: s.factory.groupKey, + Key: s.primaryKey, + Namespace: s.namespace, + UserKey: []byte{}, + }) +} + +func (s *KeyedMapState[MK, MV]) All() iter.Seq2[MK, MV] { + return func(yield func(MK, MV) bool) { + it, err := s.factory.store.ScanComplex( + s.factory.groupKey, + s.primaryKey, + s.namespace, + ) + if err != nil { + return + } + defer it.Close() + + for { + has, err := it.HasNext() + if err != nil || !has { + return + } + keyRaw, valRaw, ok, err := it.Next() + if err != nil || !ok { + return + } + + k, err := s.factory.mapKeyCodec.Decode(keyRaw) + if err != nil { + continue // skip entry on decode error to avoid yielding corrupted zero values + } + v, err := 
s.factory.mapValueCodec.Decode(valRaw) + if err != nil { + continue + } + + if !yield(k, v) { + return + } + } + } +} diff --git a/go-sdk-advanced/keyed/keyed_priority_queue_state.go b/go-sdk-advanced/keyed/keyed_priority_queue_state.go new file mode 100644 index 00000000..0ee46575 --- /dev/null +++ b/go-sdk-advanced/keyed/keyed_priority_queue_state.go @@ -0,0 +1,203 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package keyed + +import ( + "fmt" + "iter" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +type KeyedPriorityQueueStateFactory[V any] struct { + store common.Store + groupKey []byte + valueCodec codec.Codec[V] +} + +// NewKeyedPriorityQueueStateFactoryFromContext creates a KeyedPriorityQueueStateFactory using the store from ctx.GetOrCreateStore(storeName). +func NewKeyedPriorityQueueStateFactoryFromContext[V any](ctx api.Context, storeName string, keyGroup []byte, itemCodec codec.Codec[V]) (*KeyedPriorityQueueStateFactory[V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newKeyedPriorityQueueStateFactory(store, keyGroup, itemCodec) +} + +// NewKeyedPriorityQueueStateFactoryFromContextAutoCodec creates a KeyedPriorityQueueStateFactory with default value codec. V must have an ordered default codec. 
+func NewKeyedPriorityQueueStateFactoryFromContextAutoCodec[V any](ctx api.Context, storeName string, keyGroup []byte) (*KeyedPriorityQueueStateFactory[V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + valueCodec, err := codec.DefaultCodecFor[V]() + if err != nil { + return nil, err + } + return newKeyedPriorityQueueStateFactory(store, keyGroup, valueCodec) +} + +func newKeyedPriorityQueueStateFactory[V any]( + store common.Store, + keyGroup []byte, + valueCodec codec.Codec[V], +) (*KeyedPriorityQueueStateFactory[V], error) { + + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed priority queue state factory store must not be nil") + } + if keyGroup == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed priority queue state factory key_group must not be nil") + } + if valueCodec == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed priority queue state factory value codec must not be nil") + } + + if !valueCodec.IsOrderedKeyCodec() { + return nil, api.NewError(api.ErrStoreInternal, "priority queue value codec must be ordered") + } + + return &KeyedPriorityQueueStateFactory[V]{ + store: store, + groupKey: common.DupBytes(keyGroup), + valueCodec: valueCodec, + }, nil +} + +type KeyedPriorityQueueState[V any] struct { + factory *KeyedPriorityQueueStateFactory[V] + primaryKey []byte + namespace []byte +} + +func (f *KeyedPriorityQueueStateFactory[V]) NewKeyedPriorityQueue(primaryKey []byte, namespace []byte) (*KeyedPriorityQueueState[V], error) { + if primaryKey == nil || namespace == nil { + return nil, api.NewError(api.ErrStoreInternal, "primary key and queue name are required") + } + return &KeyedPriorityQueueState[V]{ + factory: f, + primaryKey: common.DupBytes(primaryKey), + namespace: common.DupBytes(namespace), + }, nil +} + +func (s *KeyedPriorityQueueState[V]) Add(value V) error { + userKey, err := s.factory.valueCodec.Encode(value) + if err != nil { + return 
fmt.Errorf("encode pq element failed: %w", err) + } + + ck := api.ComplexKey{ + KeyGroup: s.factory.groupKey, + Key: s.primaryKey, + Namespace: s.namespace, + UserKey: userKey, + } + + return s.factory.store.Put(ck, []byte{}) +} + +func (s *KeyedPriorityQueueState[V]) Peek() (V, bool, error) { + var zero V + + it, err := s.factory.store.ScanComplex( + s.factory.groupKey, + s.primaryKey, + s.namespace, + ) + if err != nil { + return zero, false, err + } + defer it.Close() + + has, err := it.HasNext() + if err != nil || !has { + return zero, false, err + } + + userKey, _, ok, err := it.Next() + if err != nil || !ok { + return zero, false, err + } + + val, err := s.factory.valueCodec.Decode(userKey) + if err != nil { + return zero, false, err + } + return val, true, nil +} + +func (s *KeyedPriorityQueueState[V]) Poll() (V, bool, error) { + val, found, err := s.Peek() + if err != nil || !found { + return val, found, err + } + + userKey, err := s.factory.valueCodec.Encode(val) + if err != nil { + return val, true, fmt.Errorf("encode pq element for delete failed: %w", err) + } + ck := api.ComplexKey{ + KeyGroup: s.factory.groupKey, + Key: s.primaryKey, + Namespace: s.namespace, + UserKey: userKey, + } + + err = s.factory.store.Delete(ck) + return val, true, err +} + +func (s *KeyedPriorityQueueState[V]) Clear() error { + return s.factory.store.DeletePrefix(api.ComplexKey{ + KeyGroup: s.factory.groupKey, + Key: s.primaryKey, + Namespace: s.namespace, + UserKey: []byte{}, + }) +} + +func (s *KeyedPriorityQueueState[V]) All() iter.Seq[V] { + return func(yield func(V) bool) { + it, err := s.factory.store.ScanComplex( + s.factory.groupKey, + s.primaryKey, + s.namespace, + ) + if err != nil { + return + } + defer it.Close() + + for { + has, err := it.HasNext() + if err != nil || !has { + return + } + userKey, _, ok, err := it.Next() + if err != nil || !ok { + return + } + + v, err := s.factory.valueCodec.Decode(userKey) + if err != nil { + continue // skip 
entry on decode error to avoid yielding corrupted zero values + } + if !yield(v) { + return + } + } + } +} diff --git a/go-sdk-advanced/keyed/keyed_reducing_state.go b/go-sdk-advanced/keyed/keyed_reducing_state.go new file mode 100644 index 00000000..dd255158 --- /dev/null +++ b/go-sdk-advanced/keyed/keyed_reducing_state.go @@ -0,0 +1,151 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package keyed + +import ( + "fmt" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +type ReduceFunc[V any] func(value1 V, value2 V) (V, error) + +type KeyedReducingStateFactory[V any] struct { + store common.Store + groupKey []byte + valueCodec codec.Codec[V] + reduceFunc ReduceFunc[V] +} + +// NewKeyedReducingStateFactoryFromContext creates a KeyedReducingStateFactory using the store from ctx.GetOrCreateStore(storeName). 
+func NewKeyedReducingStateFactoryFromContext[V any](ctx api.Context, storeName string, keyGroup []byte, valueCodec codec.Codec[V], reduceFunc ReduceFunc[V]) (*KeyedReducingStateFactory[V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newKeyedReducingStateFactory(store, keyGroup, valueCodec, reduceFunc) +} + +// NewKeyedReducingStateFactoryFromContextAutoCodec creates a KeyedReducingStateFactory with default value codec from ctx.GetOrCreateStore(storeName). +func NewKeyedReducingStateFactoryFromContextAutoCodec[V any](ctx api.Context, storeName string, keyGroup []byte, reduceFunc ReduceFunc[V]) (*KeyedReducingStateFactory[V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + valueCodec, err := codec.DefaultCodecFor[V]() + if err != nil { + return nil, err + } + return newKeyedReducingStateFactory(store, keyGroup, valueCodec, reduceFunc) +} + +func newKeyedReducingStateFactory[V any]( + store common.Store, + keyGroup []byte, + valueCodec codec.Codec[V], + reduceFunc ReduceFunc[V], +) (*KeyedReducingStateFactory[V], error) { + + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed reducing state factory store must not be nil") + } + if keyGroup == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed reducing state factory key_group must not be nil") + } + if valueCodec == nil || reduceFunc == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed reducing state factory value_codec and reduce_func must not be nil") + } + + return &KeyedReducingStateFactory[V]{ + store: store, + groupKey: common.DupBytes(keyGroup), + valueCodec: valueCodec, + reduceFunc: reduceFunc, + }, nil +} + +func (f *KeyedReducingStateFactory[V]) NewReducingState(primaryKey []byte, namespace []byte) (*KeyedReducingState[V], error) { + if primaryKey == nil || namespace == nil { + return nil, api.NewError(api.ErrStoreInternal, "primary key and state name 
are required") + } + return &KeyedReducingState[V]{ + factory: f, + primaryKey: common.DupBytes(primaryKey), + namespace: common.DupBytes(namespace), + }, nil +} + +type KeyedReducingState[V any] struct { + factory *KeyedReducingStateFactory[V] + primaryKey []byte + namespace []byte +} + +func (s *KeyedReducingState[V]) buildCK() api.ComplexKey { + return api.ComplexKey{ + KeyGroup: s.factory.groupKey, + Key: s.primaryKey, + Namespace: s.namespace, + UserKey: []byte{}, + } +} + +func (s *KeyedReducingState[V]) Add(value V) error { + ck := s.buildCK() + raw, found, err := s.factory.store.Get(ck) + if err != nil { + return fmt.Errorf("failed to get old value for reducing state: %w", err) + } + + var result V + if !found { + result = value + } else { + oldValue, err := s.factory.valueCodec.Decode(raw) + if err != nil { + return fmt.Errorf("failed to decode old value: %w", err) + } + + result, err = s.factory.reduceFunc(oldValue, value) + if err != nil { + return fmt.Errorf("error in user reduce function: %w", err) + } + } + + encoded, err := s.factory.valueCodec.Encode(result) + if err != nil { + return fmt.Errorf("failed to encode reduced value: %w", err) + } + + return s.factory.store.Put(ck, encoded) +} + +func (s *KeyedReducingState[V]) Get() (V, bool, error) { + var zero V + ck := s.buildCK() + raw, found, err := s.factory.store.Get(ck) + if err != nil || !found { + return zero, found, err + } + val, err := s.factory.valueCodec.Decode(raw) + if err != nil { + return zero, false, fmt.Errorf("failed to decode value: %w", err) + } + return val, true, nil +} + +func (s *KeyedReducingState[V]) Clear() error { + return s.factory.store.Delete(s.buildCK()) +} diff --git a/go-sdk-advanced/keyed/keyed_value_state.go b/go-sdk-advanced/keyed/keyed_value_state.go new file mode 100644 index 00000000..a2e6d236 --- /dev/null +++ b/go-sdk-advanced/keyed/keyed_value_state.go @@ -0,0 +1,128 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use 
this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package keyed provides keyed state types (KeyedValueState, KeyedListState, etc.) +// for the Advanced SDK. This library depends on go-sdk (low-level). +package keyed + +import ( + "fmt" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +type KeyedValueStateFactory[V any] struct { + store common.Store + groupKey []byte + valueCodec codec.Codec[V] +} + +// NewKeyedValueStateFactoryFromContext creates a KeyedValueStateFactory using the store from ctx.GetOrCreateStore(storeName). +func NewKeyedValueStateFactoryFromContext[V any](ctx api.Context, storeName string, keyGroup []byte, valueCodec codec.Codec[V]) (*KeyedValueStateFactory[V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newKeyedValueStateFactory(store, keyGroup, valueCodec) +} + +// NewKeyedValueStateFactoryFromContextAutoCodec creates a KeyedValueStateFactory with default value codec from ctx.GetOrCreateStore(storeName). 
+func NewKeyedValueStateFactoryFromContextAutoCodec[V any](ctx api.Context, storeName string, keyGroup []byte) (*KeyedValueStateFactory[V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + valueCodec, err := codec.DefaultCodecFor[V]() + if err != nil { + return nil, err + } + return newKeyedValueStateFactory(store, keyGroup, valueCodec) +} + +func newKeyedValueStateFactory[V any]( + store common.Store, + keyGroup []byte, + valueCodec codec.Codec[V], +) (*KeyedValueStateFactory[V], error) { + + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed value state factory store must not be nil") + } + if keyGroup == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed value state factory key_group must not be nil") + } + if valueCodec == nil { + return nil, api.NewError(api.ErrStoreInternal, "keyed value state factory value codec must not be nil") + } + + return &KeyedValueStateFactory[V]{ + store: store, + groupKey: common.DupBytes(keyGroup), + valueCodec: valueCodec, + }, nil +} + +// NewKeyedValue creates a KeyedValueState for the given primary key and namespace. 
+func (f *KeyedValueStateFactory[V]) NewKeyedValue(primaryKey []byte, namespace []byte) (*KeyedValueState[V], error) { + if primaryKey == nil || namespace == nil { + return nil, api.NewError(api.ErrStoreInternal, "primary key and namespace are required") + } + return &KeyedValueState[V]{ + factory: f, + primaryKey: common.DupBytes(primaryKey), + namespace: common.DupBytes(namespace), + }, nil +} + +type KeyedValueState[V any] struct { + factory *KeyedValueStateFactory[V] + primaryKey []byte + namespace []byte +} + +func (s *KeyedValueState[V]) buildCK() api.ComplexKey { + return api.ComplexKey{ + KeyGroup: s.factory.groupKey, + Key: s.primaryKey, + Namespace: s.namespace, + UserKey: []byte{}, + } +} + +func (s *KeyedValueState[V]) Update(value V) error { + ck := s.buildCK() + encoded, err := s.factory.valueCodec.Encode(value) + if err != nil { + return fmt.Errorf("encode value state failed: %w", err) + } + return s.factory.store.Put(ck, encoded) +} + +func (s *KeyedValueState[V]) Value() (V, bool, error) { + var zero V + ck := s.buildCK() + raw, found, err := s.factory.store.Get(ck) + if err != nil || !found { + return zero, found, err + } + decoded, err := s.factory.valueCodec.Decode(raw) + if err != nil { + return zero, false, fmt.Errorf("decode value state failed: %w", err) + } + return decoded, true, nil +} + +func (s *KeyedValueState[V]) Clear() error { + return s.factory.store.Delete(s.buildCK()) +} diff --git a/go-sdk-advanced/structures/aggregating.go b/go-sdk-advanced/structures/aggregating.go new file mode 100644 index 00000000..7b8b4a79 --- /dev/null +++ b/go-sdk-advanced/structures/aggregating.go @@ -0,0 +1,146 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package structures + +import ( + "fmt" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +type AggregateFunc[T any, ACC any, R any] interface { + CreateAccumulator() ACC + Add(value T, accumulator ACC) ACC + GetResult(accumulator ACC) R + Merge(a ACC, b ACC) ACC +} + +type AggregatingState[T any, ACC any, R any] struct { + store common.Store + complexKey api.ComplexKey + accCodec codec.Codec[ACC] + aggFunc AggregateFunc[T, ACC, R] +} + +// NewAggregatingStateFromContext creates an AggregatingState using the store from ctx.GetOrCreateStore(storeName). +func NewAggregatingStateFromContext[T any, ACC any, R any]( + ctx api.Context, + storeName string, + accCodec codec.Codec[ACC], + aggFunc AggregateFunc[T, ACC, R], +) (*AggregatingState[T, ACC, R], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newAggregatingState(store, accCodec, aggFunc) +} + +// NewAggregatingStateFromContextAutoCodec creates an AggregatingState with default accumulator codec from ctx.GetOrCreateStore(storeName). 
+func NewAggregatingStateFromContextAutoCodec[T any, ACC any, R any]( + ctx api.Context, + storeName string, + aggFunc AggregateFunc[T, ACC, R], +) (*AggregatingState[T, ACC, R], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + accCodec, err := codec.DefaultCodecFor[ACC]() + if err != nil { + return nil, err + } + return newAggregatingState(store, accCodec, aggFunc) +} + +func newAggregatingState[T any, ACC any, R any]( + store common.Store, + accCodec codec.Codec[ACC], + aggFunc AggregateFunc[T, ACC, R], +) (*AggregatingState[T, ACC, R], error) { + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "aggregating state store must not be nil") + } + if accCodec == nil { + return nil, api.NewError(api.ErrStoreInternal, "aggregating state acc codec must not be nil") + } + if aggFunc == nil { + return nil, api.NewError(api.ErrStoreInternal, "aggregating state agg func must not be nil") + } + ck := api.ComplexKey{ + KeyGroup: []byte{}, + Key: []byte{}, + Namespace: []byte{}, + UserKey: []byte{}, + } + return &AggregatingState[T, ACC, R]{ + store: store, + complexKey: ck, + accCodec: accCodec, + aggFunc: aggFunc, + }, nil +} + +func (s *AggregatingState[T, ACC, R]) buildCK() api.ComplexKey { + return s.complexKey +} + +func (s *AggregatingState[T, ACC, R]) Add(value T) error { + ck := s.buildCK() + + raw, found, err := s.store.Get(ck) + if err != nil { + return fmt.Errorf("failed to get accumulator: %w", err) + } + + var acc ACC + if !found { + acc = s.aggFunc.CreateAccumulator() + } else { + var err error + acc, err = s.accCodec.Decode(raw) + if err != nil { + return fmt.Errorf("failed to decode accumulator: %w", err) + } + } + + newAcc := s.aggFunc.Add(value, acc) + + encoded, err := s.accCodec.Encode(newAcc) + if err != nil { + return fmt.Errorf("failed to encode new accumulator: %w", err) + } + return s.store.Put(ck, encoded) +} + +func (s *AggregatingState[T, ACC, R]) Get() (R, bool, error) { + var 
zero R + ck := s.buildCK() + raw, found, err := s.store.Get(ck) + if err != nil || !found { + return zero, found, err + } + + acc, err := s.accCodec.Decode(raw) + if err != nil { + return zero, false, err + } + + return s.aggFunc.GetResult(acc), true, nil +} + +func (s *AggregatingState[T, ACC, R]) Clear() error { + return s.store.Delete(s.buildCK()) +} diff --git a/go-sdk-advanced/structures/list.go b/go-sdk-advanced/structures/list.go new file mode 100644 index 00000000..53907c11 --- /dev/null +++ b/go-sdk-advanced/structures/list.go @@ -0,0 +1,241 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package structures provides high-level state types (ListState, MapState, etc.) +// for the Advanced SDK. This library depends on go-sdk (low-level). +package structures + +import ( + "encoding/binary" + "fmt" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +type ListState[T any] struct { + store common.Store + complexKey api.ComplexKey + codec codec.Codec[T] + fixedSize int + serialize func(T) ([]byte, error) + decode func([]byte) ([]T, error) + serializeBatch func([]T) ([]byte, error) +} + +// NewListStateFromContext creates a ListState using the store from ctx.GetOrCreateStore(storeName). 
+func NewListStateFromContext[T any](ctx api.Context, storeName string, itemCodec codec.Codec[T]) (*ListState[T], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newListState(store, itemCodec) +} + +// NewListStateFromContextAutoCodec creates a ListState with default codec for T from ctx.GetOrCreateStore(storeName). +func NewListStateFromContextAutoCodec[T any](ctx api.Context, storeName string) (*ListState[T], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + itemCodec, err := codec.DefaultCodecFor[T]() + if err != nil { + return nil, err + } + return newListState(store, itemCodec) +} + +func newListState[T any](store common.Store, itemCodec codec.Codec[T]) (*ListState[T], error) { + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "list state store must not be nil") + } + if itemCodec == nil { + return nil, api.NewError(api.ErrStoreInternal, "list state codec must not be nil") + } + fixedSize, isFixed := codec.FixedEncodedSize[T](itemCodec) + l := &ListState[T]{ + store: store, + complexKey: api.ComplexKey{ + KeyGroup: []byte{}, + Key: []byte{}, + Namespace: []byte{}, + UserKey: []byte{}, + }, + codec: itemCodec, + fixedSize: fixedSize, + } + if isFixed { + l.serialize = l.serializeValueFixed + l.serializeBatch = l.serializeValuesFixedBatch + l.decode = l.deserializeValuesFixed + } else { + l.serialize = l.serializeValueVarLen + l.serializeBatch = l.serializeValuesVarLenBatch + l.decode = l.deserializeValuesVarLen + } + return l, nil +} + +func (l *ListState[T]) Add(value T) error { + payload, err := l.serialize(value) + if err != nil { + return err + } + return l.store.Merge(l.complexKey, payload) +} + +func (l *ListState[T]) AddAll(values []T) error { + payload, err := l.serializeBatch(values) + if err != nil { + return err + } + if err := l.store.Merge(l.complexKey, payload); err != nil { + return err + } + return nil +} + +func (l 
*ListState[T]) Get() ([]T, error) { + raw, found, err := l.store.Get(l.complexKey) + if err != nil { + return nil, err + } + if !found { + return []T{}, nil + } + return l.decode(raw) +} + +func (l *ListState[T]) Update(values []T) error { + if len(values) == 0 { + return l.Clear() + } + payload, err := l.serializeBatch(values) + if err != nil { + return err + } + return l.store.Put(l.complexKey, payload) +} + +func (l *ListState[T]) Clear() error { + return l.store.Delete(l.complexKey) +} + +func (l *ListState[T]) serializeValueVarLen(value T) ([]byte, error) { + encoded, err := l.codec.Encode(value) + if err != nil { + return nil, fmt.Errorf("encode list value failed: %w", err) + } + out := make([]byte, 4, 4+len(encoded)) + binary.BigEndian.PutUint32(out, uint32(len(encoded))) + out = append(out, encoded...) + return out, nil +} + +func (l *ListState[T]) serializeValuesVarLenBatch(values []T) ([]byte, error) { + total := 0 + encodedValues := make([][]byte, 0, len(values)) + for _, value := range values { + encoded, err := l.codec.Encode(value) + if err != nil { + return nil, fmt.Errorf("encode list value failed: %w", err) + } + encodedValues = append(encodedValues, encoded) + total += 4 + len(encoded) + } + out := make([]byte, 0, total) + for _, encoded := range encodedValues { + var lenBuf [4]byte + binary.BigEndian.PutUint32(lenBuf[:], uint32(len(encoded))) + out = append(out, lenBuf[:]...) + out = append(out, encoded...) 
+ } + return out, nil +} + +func (l *ListState[T]) deserializeValuesVarLen(raw []byte) ([]T, error) { + out := make([]T, 0, 16) + idx := 0 + for idx < len(raw) { + if len(raw)-idx < 4 { + return nil, api.NewError(api.ErrResultUnexpected, "corrupted list payload: truncated length") + } + itemLen := int(binary.BigEndian.Uint32(raw[idx : idx+4])) + idx += 4 + if itemLen < 0 || len(raw)-idx < itemLen { + return nil, api.NewError(api.ErrResultUnexpected, "corrupted list payload: invalid element length") + } + itemRaw := raw[idx : idx+itemLen] + idx += itemLen + value, err := l.codec.Decode(itemRaw) + if err != nil { + return nil, fmt.Errorf("decode list value failed: %w", err) + } + out = append(out, value) + } + return out, nil +} + +func (l *ListState[T]) serializeValueFixed(value T) ([]byte, error) { + if l.fixedSize <= 0 { + return nil, api.NewError(api.ErrResultUnexpected, "fixed-size codec must report positive size") + } + encoded, err := l.codec.Encode(value) + if err != nil { + return nil, fmt.Errorf("encode list value failed: %w", err) + } + if len(encoded) != l.fixedSize { + return nil, api.NewError(api.ErrResultUnexpected, "fixed-size codec encoded unexpected length: got %d, want %d", len(encoded), l.fixedSize) + } + out := make([]byte, 0, l.fixedSize) + out = append(out, encoded...) + return out, nil +} + +func (l *ListState[T]) serializeValuesFixedBatch(values []T) ([]byte, error) { + if l.fixedSize <= 0 { + return nil, api.NewError(api.ErrResultUnexpected, "fixed-size codec must report positive size") + } + total := l.fixedSize * len(values) + out := make([]byte, 0, total) + for _, value := range values { + encoded, err := l.codec.Encode(value) + if err != nil { + return nil, fmt.Errorf("encode list value failed: %w", err) + } + if len(encoded) != l.fixedSize { + return nil, api.NewError(api.ErrResultUnexpected, "fixed-size codec encoded unexpected length: got %d, want %d", len(encoded), l.fixedSize) + } + out = append(out, encoded...) 
+ } + return out, nil +} + +func (l *ListState[T]) deserializeValuesFixed(raw []byte) ([]T, error) { + if l.fixedSize <= 0 { + return nil, api.NewError(api.ErrResultUnexpected, "fixed-size codec must report positive size") + } + if len(raw)%l.fixedSize != 0 { + return nil, api.NewError(api.ErrResultUnexpected, "corrupted list payload: fixed-size data length mismatch") + } + out := make([]T, 0, len(raw)/l.fixedSize) + for idx := 0; idx < len(raw); idx += l.fixedSize { + itemRaw := raw[idx : idx+l.fixedSize] + value, err := l.codec.Decode(itemRaw) + if err != nil { + return nil, fmt.Errorf("decode list value failed: %w", err) + } + out = append(out, value) + } + return out, nil +} diff --git a/go-sdk-advanced/structures/map.go b/go-sdk-advanced/structures/map.go new file mode 100644 index 00000000..b90b051d --- /dev/null +++ b/go-sdk-advanced/structures/map.go @@ -0,0 +1,177 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package structures + +import ( + "fmt" + "iter" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +type MapEntry[K any, V any] struct { + Key K + Value V +} + +type MapState[K any, V any] struct { + store common.Store + keyGroup []byte + key []byte + namespace []byte + keyCodec codec.Codec[K] + valueCodec codec.Codec[V] +} + +// NewMapStateFromContext creates a MapState using the store from ctx.GetOrCreateStore(storeName). +func NewMapStateFromContext[K any, V any](ctx api.Context, storeName string, keyCodec codec.Codec[K], valueCodec codec.Codec[V]) (*MapState[K, V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newMapState(store, keyCodec, valueCodec) +} + +// NewMapStateAutoKeyCodecFromContext creates a MapState with default key codec using the store from context. +func NewMapStateAutoKeyCodecFromContext[K any, V any](ctx api.Context, storeName string, valueCodec codec.Codec[V]) (*MapState[K, V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newMapStateAutoKeyCodec[K, V](store, valueCodec) +} + +// NewMapStateFromContextAutoCodec creates a MapState with default key and value codecs from ctx.GetOrCreateStore(storeName). Key type K must have an ordered default codec. 
+func NewMapStateFromContextAutoCodec[K any, V any](ctx api.Context, storeName string) (*MapState[K, V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + keyCodec, err := codec.DefaultCodecFor[K]() + if err != nil { + return nil, err + } + valueCodec, err := codec.DefaultCodecFor[V]() + if err != nil { + return nil, err + } + return newMapState(store, keyCodec, valueCodec) +} + +func newMapState[K any, V any](store common.Store, keyCodec codec.Codec[K], valueCodec codec.Codec[V]) (*MapState[K, V], error) { + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "map state store must not be nil") + } + if keyCodec == nil { + return nil, api.NewError(api.ErrStoreInternal, "map state key codec must not be nil") + } + if valueCodec == nil { + return nil, api.NewError(api.ErrStoreInternal, "map state value codec must not be nil") + } + if !keyCodec.IsOrderedKeyCodec() { + return nil, api.NewError(api.ErrStoreInternal, "map state key codec must be ordered (IsOrderedKeyCodec)") + } + return &MapState[K, V]{store: store, keyGroup: []byte{}, key: []byte{}, namespace: []byte{}, keyCodec: keyCodec, valueCodec: valueCodec}, nil +} + +func newMapStateAutoKeyCodec[K any, V any](store common.Store, valueCodec codec.Codec[V]) (*MapState[K, V], error) { + autoKeyCodec, err := codec.DefaultCodecFor[K]() + if err != nil { + return nil, err + } + return newMapState[K, V](store, autoKeyCodec, valueCodec) +} + +func (m *MapState[K, V]) Put(key K, value V) error { + encodedKey, err := m.keyCodec.Encode(key) + if err != nil { + return fmt.Errorf("encode map key failed: %w", err) + } + encodedValue, err := m.valueCodec.Encode(value) + if err != nil { + return fmt.Errorf("encode map value failed: %w", err) + } + return m.store.Put(m.ck(encodedKey), encodedValue) +} + +func (m *MapState[K, V]) Get(key K) (V, bool, error) { + var zero V + encodedKey, err := m.keyCodec.Encode(key) + if err != nil { + return zero, false, 
fmt.Errorf("encode map key failed: %w", err) + } + raw, found, err := m.store.Get(m.ck(encodedKey)) + if err != nil { + return zero, false, err + } + if !found { + return zero, false, nil + } + decoded, err := m.valueCodec.Decode(raw) + if err != nil { + return zero, false, fmt.Errorf("decode map value failed: %w", err) + } + return decoded, true, nil +} + +func (m *MapState[K, V]) Delete(key K) error { + encodedKey, err := m.keyCodec.Encode(key) + if err != nil { + return fmt.Errorf("encode map key failed: %w", err) + } + return m.store.Delete(m.ck(encodedKey)) +} + +func (m *MapState[K, V]) Clear() error { + return m.store.DeletePrefix(api.ComplexKey{KeyGroup: m.keyGroup, Key: m.key, Namespace: m.namespace, UserKey: nil}) +} + +func (m *MapState[K, V]) All() iter.Seq2[K, V] { + return func(yield func(K, V) bool) { + it, err := m.store.ScanComplex(m.keyGroup, m.key, m.namespace) + if err != nil { + return + } + defer it.Close() + + for { + has, err := it.HasNext() + if err != nil || !has { + return + } + keyRaw, valRaw, ok, err := it.Next() + if err != nil || !ok { + return + } + + k, err := m.keyCodec.Decode(keyRaw) + if err != nil { + continue // skip entry on decode error to avoid yielding corrupted zero values + } + v, err := m.valueCodec.Decode(valRaw) + if err != nil { + continue + } + + if !yield(k, v) { + return + } + } + } +} + +func (m *MapState[K, V]) ck(userKey []byte) api.ComplexKey { + return api.ComplexKey{KeyGroup: m.keyGroup, Key: m.key, Namespace: m.namespace, UserKey: userKey} +} diff --git a/go-sdk-advanced/structures/priority_queue.go b/go-sdk-advanced/structures/priority_queue.go new file mode 100644 index 00000000..85f5bbda --- /dev/null +++ b/go-sdk-advanced/structures/priority_queue.go @@ -0,0 +1,164 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package structures + +import ( + "fmt" + "iter" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +// PriorityQueueState holds a priority queue. itemCodec must be ordered (IsOrderedKeyCodec() true). +type PriorityQueueState[T any] struct { + store common.Store + keyGroup []byte + key []byte + namespace []byte + valueCodec codec.Codec[T] +} + +// NewPriorityQueueStateFromContext creates a PriorityQueueState using the store from ctx.GetOrCreateStore(storeName). +func NewPriorityQueueStateFromContext[T any](ctx api.Context, storeName string, itemCodec codec.Codec[T]) (*PriorityQueueState[T], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newPriorityQueueState(store, itemCodec) +} + +// NewPriorityQueueStateFromContextAutoCodec creates a PriorityQueueState with default codec for T. T must have an ordered default codec (e.g. primitive types). +func NewPriorityQueueStateFromContextAutoCodec[T any](ctx api.Context, storeName string) (*PriorityQueueState[T], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + itemCodec, err := codec.DefaultCodecFor[T]() + if err != nil { + return nil, err + } + return newPriorityQueueState(store, itemCodec) +} + +// newPriorityQueueState creates a priority queue state. itemCodec must support ordered key encoding. 
+func newPriorityQueueState[T any](store common.Store, itemCodec codec.Codec[T]) (*PriorityQueueState[T], error) { + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "priority queue state store must not be nil") + } + if itemCodec == nil { + return nil, api.NewError(api.ErrStoreInternal, "priority queue state codec must not be nil") + } + if !itemCodec.IsOrderedKeyCodec() { + return nil, api.NewError(api.ErrStoreInternal, "priority queue codec must support ordered key encoding") + } + return &PriorityQueueState[T]{ + store: store, + keyGroup: []byte{}, + key: []byte{}, + namespace: []byte{}, + valueCodec: itemCodec, + }, nil +} + +func (q *PriorityQueueState[T]) ck(userKey []byte) api.ComplexKey { + return api.ComplexKey{KeyGroup: q.keyGroup, Key: q.key, Namespace: q.namespace, UserKey: userKey} +} + +func (q *PriorityQueueState[T]) Add(value T) error { + userKey, err := q.valueCodec.Encode(value) + if err != nil { + return fmt.Errorf("encode pq element failed: %w", err) + } + return q.store.Put(q.ck(userKey), []byte{}) +} + +func (q *PriorityQueueState[T]) Peek() (T, bool, error) { + var zero T + it, err := q.store.ScanComplex(q.keyGroup, q.key, q.namespace) + if err != nil { + return zero, false, err + } + defer it.Close() + + has, err := it.HasNext() + if err != nil || !has { + return zero, false, err + } + + userKey, _, ok, err := it.Next() + if err != nil || !ok { + return zero, false, err + } + + val, err := q.valueCodec.Decode(userKey) + if err != nil { + return zero, false, err + } + return val, true, nil +} + +func (q *PriorityQueueState[T]) Poll() (T, bool, error) { + val, found, err := q.Peek() + if err != nil || !found { + return val, found, err + } + + userKey, err := q.valueCodec.Encode(val) + if err != nil { + return val, true, fmt.Errorf("encode pq element for delete failed: %w", err) + } + if err = q.store.Delete(q.ck(userKey)); err != nil { + return val, true, err + } + return val, true, nil +} + +func (q *PriorityQueueState[T]) 
Clear() error { + return q.store.DeletePrefix(api.ComplexKey{ + KeyGroup: q.keyGroup, + Key: q.key, + Namespace: q.namespace, + UserKey: nil, + }) +} + +func (q *PriorityQueueState[T]) All() iter.Seq[T] { + return func(yield func(T) bool) { + it, err := q.store.ScanComplex(q.keyGroup, q.key, q.namespace) + if err != nil { + return + } + defer it.Close() + + for { + has, err := it.HasNext() + if err != nil || !has { + return + } + userKey, _, ok, err := it.Next() + if err != nil || !ok { + return + } + + v, err := q.valueCodec.Decode(userKey) + if err != nil { + continue // skip entry on decode error to avoid yielding corrupted zero values + } + if !yield(v) { + return + } + } + } +} diff --git a/go-sdk-advanced/structures/reducing.go b/go-sdk-advanced/structures/reducing.go new file mode 100644 index 00000000..c9588bee --- /dev/null +++ b/go-sdk-advanced/structures/reducing.go @@ -0,0 +1,138 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package structures + +import ( + "fmt" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +type ReduceFunc[V any] func(value1 V, value2 V) (V, error) + +type ReducingState[V any] struct { + store common.Store + complexKey api.ComplexKey + valueCodec codec.Codec[V] + reduceFunc ReduceFunc[V] +} + +// NewReducingStateFromContext creates a ReducingState using the store from ctx.GetOrCreateStore(storeName). +func NewReducingStateFromContext[V any]( + ctx api.Context, + storeName string, + valueCodec codec.Codec[V], + reduceFunc ReduceFunc[V], +) (*ReducingState[V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newReducingState(store, valueCodec, reduceFunc) +} + +// NewReducingStateFromContextAutoCodec creates a ReducingState with default value codec from ctx.GetOrCreateStore(storeName). 
+func NewReducingStateFromContextAutoCodec[V any]( + ctx api.Context, + storeName string, + reduceFunc ReduceFunc[V], +) (*ReducingState[V], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + valueCodec, err := codec.DefaultCodecFor[V]() + if err != nil { + return nil, err + } + return newReducingState(store, valueCodec, reduceFunc) +} + +func newReducingState[V any]( + store common.Store, + valueCodec codec.Codec[V], + reduceFunc ReduceFunc[V], +) (*ReducingState[V], error) { + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "reducing state store must not be nil") + } + if valueCodec == nil || reduceFunc == nil { + return nil, api.NewError(api.ErrStoreInternal, "reducing state value codec and reduce function are required") + } + ck := api.ComplexKey{ + KeyGroup: []byte{}, + Key: []byte{}, + Namespace: []byte{}, + UserKey: []byte{}, + } + return &ReducingState[V]{ + store: store, + complexKey: ck, + valueCodec: valueCodec, + reduceFunc: reduceFunc, + }, nil +} + +func (s *ReducingState[V]) buildCK() api.ComplexKey { + return s.complexKey +} + +func (s *ReducingState[V]) Add(value V) error { + ck := s.buildCK() + raw, found, err := s.store.Get(ck) + if err != nil { + return fmt.Errorf("failed to get old value for reducing state: %w", err) + } + + var result V + if !found { + result = value + } else { + oldValue, err := s.valueCodec.Decode(raw) + if err != nil { + return fmt.Errorf("failed to decode old value: %w", err) + } + + result, err = s.reduceFunc(oldValue, value) + if err != nil { + return fmt.Errorf("error in user reduce function: %w", err) + } + } + + encoded, err := s.valueCodec.Encode(result) + if err != nil { + return fmt.Errorf("failed to encode reduced value: %w", err) + } + + return s.store.Put(ck, encoded) +} + +func (s *ReducingState[V]) Get() (V, bool, error) { + var zero V + ck := s.buildCK() + raw, found, err := s.store.Get(ck) + if err != nil || !found { + return zero, found, err 
+ } + val, err := s.valueCodec.Decode(raw) + if err != nil { + return zero, false, fmt.Errorf("failed to decode value: %w", err) + } + return val, true, nil +} + +func (s *ReducingState[V]) Clear() error { + return s.store.Delete(s.buildCK()) +} diff --git a/go-sdk-advanced/structures/value.go b/go-sdk-advanced/structures/value.go new file mode 100644 index 00000000..a7a9ce4a --- /dev/null +++ b/go-sdk-advanced/structures/value.go @@ -0,0 +1,97 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +// Package structures provides state types (ValueState, ListState, MapState, etc.) +// for the Advanced SDK. Depends on go-sdk (low-level) and go-sdk-advanced/codec. +package structures + +import ( + "fmt" + + "github.com/functionstream/function-stream/go-sdk-advanced/codec" + "github.com/functionstream/function-stream/go-sdk/api" + "github.com/functionstream/function-stream/go-sdk/state/common" +) + +type ValueState[T any] struct { + store common.Store + complexKey api.ComplexKey + valueCodec codec.Codec[T] +} + +// NewValueStateFromContext creates a ValueState using the store from ctx.GetOrCreateStore(storeName). 
+func NewValueStateFromContext[T any](ctx api.Context, storeName string, valueCodec codec.Codec[T]) (*ValueState[T], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + return newValueState(store, valueCodec) +} + +// NewValueStateFromContextAutoCodec creates a ValueState with default codec for T from ctx.GetOrCreateStore(storeName). +func NewValueStateFromContextAutoCodec[T any](ctx api.Context, storeName string) (*ValueState[T], error) { + store, err := ctx.GetOrCreateStore(storeName) + if err != nil { + return nil, err + } + valueCodec, err := codec.DefaultCodecFor[T]() + if err != nil { + return nil, err + } + return newValueState(store, valueCodec) +} + +func newValueState[T any](store common.Store, valueCodec codec.Codec[T]) (*ValueState[T], error) { + if store == nil { + return nil, api.NewError(api.ErrStoreInternal, "value state store must not be nil") + } + if valueCodec == nil { + return nil, api.NewError(api.ErrStoreInternal, "value state codec must not be nil") + } + ck := api.ComplexKey{ + KeyGroup: []byte{}, + Key: []byte{}, + Namespace: []byte{}, + UserKey: []byte{}, + } + return &ValueState[T]{store: store, complexKey: ck, valueCodec: valueCodec}, nil +} + +func (v *ValueState[T]) buildCK() api.ComplexKey { + return v.complexKey +} + +func (v *ValueState[T]) Update(value T) error { + encoded, err := v.valueCodec.Encode(value) + if err != nil { + return fmt.Errorf("encode value state failed: %w", err) + } + return v.store.Put(v.buildCK(), encoded) +} + +func (v *ValueState[T]) Value() (T, bool, error) { + var zero T + ck := v.buildCK() + raw, found, err := v.store.Get(ck) + if err != nil || !found { + return zero, found, err + } + decoded, err := v.valueCodec.Decode(raw) + if err != nil { + return zero, false, fmt.Errorf("decode value state failed: %w", err) + } + return decoded, true, nil +} + +func (v *ValueState[T]) Clear() error { + return v.store.Delete(v.buildCK()) +} diff --git 
a/go-sdk/impl/context.go b/go-sdk/impl/context.go index 54d374a1..e5cc3c8f 100644 --- a/go-sdk/impl/context.go +++ b/go-sdk/impl/context.go @@ -14,14 +14,12 @@ package impl import ( "strings" - "sync" "github.com/functionstream/function-stream/go-sdk/api" "github.com/functionstream/function-stream/go-sdk/bindings/functionstream/core/collector" ) type runtimeContext struct { - mu sync.RWMutex config map[string]string stores map[string]*storeImpl closed bool @@ -35,9 +33,7 @@ func newRuntimeContext(config map[string]string) *runtimeContext { } func (c *runtimeContext) Emit(targetID uint32, data []byte) error { - c.mu.RLock() closed := c.closed - c.mu.RUnlock() if closed { return api.NewError(api.ErrRuntimeClosed, "emit on closed context") } @@ -46,9 +42,7 @@ func (c *runtimeContext) Emit(targetID uint32, data []byte) error { } func (c *runtimeContext) EmitWatermark(targetID uint32, watermark uint64) error { - c.mu.RLock() closed := c.closed - c.mu.RUnlock() if closed { return api.NewError(api.ErrRuntimeClosed, "emit watermark on closed context") } @@ -62,8 +56,6 @@ func (c *runtimeContext) GetOrCreateStore(name string) (api.Store, error) { return nil, api.NewError(api.ErrStoreInvalidName, "store name must not be empty") } - c.mu.Lock() - defer c.mu.Unlock() if c.closed { return nil, api.NewError(api.ErrRuntimeClosed, "store request on closed context") } @@ -77,21 +69,16 @@ func (c *runtimeContext) GetOrCreateStore(name string) (api.Store, error) { } func (c *runtimeContext) Config() map[string]string { - c.mu.RLock() - defer c.mu.RUnlock() return cloneStringMap(c.config) } func (c *runtimeContext) Close() error { - c.mu.Lock() if c.closed { - c.mu.Unlock() return nil } c.closed = true stores := c.stores c.stores = make(map[string]*storeImpl) - c.mu.Unlock() var firstErr error for _, store := range stores { diff --git a/go-sdk/state/common/common.go b/go-sdk/state/common/common.go new file mode 100644 index 00000000..b889fb28 --- /dev/null +++ 
b/go-sdk/state/common/common.go @@ -0,0 +1,28 @@ +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package common + +import ( + "github.com/functionstream/function-stream/go-sdk/api" +) + +type Store = api.Store + +func DupBytes(input []byte) []byte { + if input == nil { + return nil + } + out := make([]byte, len(input)) + copy(out, input) + return out +} diff --git a/python/functionstream-api-advanced/pyproject.toml b/python/functionstream-api-advanced/pyproject.toml new file mode 100644 index 00000000..01f7c606 --- /dev/null +++ b/python/functionstream-api-advanced/pyproject.toml @@ -0,0 +1,34 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +[build-system] +requires = ["setuptools>=61.0", "wheel"] +build-backend = "setuptools.build_meta" + +[project] +name = "functionstream-api-advanced" +version = "0.1.0" +description = "Function Stream Advanced State API - high-level state types (experimental), depends on functionstream-api" +requires-python = ">=3.7" +license = {text = "Apache-2.0"} +authors = [ + {name = "Function Stream Team"} +] +dependencies = [ + "functionstream-api>=0.6.0", +] + +[tool.setuptools.packages.find] +where = ["src"] + +[tool.setuptools.package-dir] +"" = "src" diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/__init__.py b/python/functionstream-api-advanced/src/fs_api_advanced/__init__.py new file mode 100644 index 00000000..a01a80ce --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/__init__.py @@ -0,0 +1,93 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +Function Stream Advanced State API. + +This library depends on functionstream-api (low-level). It provides +Codec, ValueState, ListState, MapState, PriorityQueueState, +AggregatingState, ReducingState, and all Keyed* state types. 
+""" + +from .codec import ( + Codec, + JsonCodec, + PickleCodec, + BytesCodec, + StringCodec, + BoolCodec, + IntCodec, + FloatCodec, + default_codec_for, +) +from .structures import ( + ValueState, + ListState, + MapEntry, + MapState, + infer_ordered_key_codec, + create_map_state_auto_key_codec, + PriorityQueueState, + AggregateFunc, + AggregatingState, + ReduceFunc, + ReducingState, +) +from .keyed import ( + KeyedListStateFactory, + KeyedValueStateFactory, + KeyedMapStateFactory, + KeyedPriorityQueueStateFactory, + KeyedAggregatingStateFactory, + KeyedReducingStateFactory, + KeyedValueState, + KeyedMapState, + KeyedListState, + KeyedPriorityQueueState, + KeyedAggregatingState, + KeyedReducingState, +) + +__all__ = [ + "Codec", + "JsonCodec", + "PickleCodec", + "BytesCodec", + "StringCodec", + "BoolCodec", + "IntCodec", + "FloatCodec", + "default_codec_for", + "ValueState", + "ListState", + "MapEntry", + "MapState", + "infer_ordered_key_codec", + "create_map_state_auto_key_codec", + "PriorityQueueState", + "AggregateFunc", + "AggregatingState", + "ReduceFunc", + "ReducingState", + "KeyedListStateFactory", + "KeyedValueStateFactory", + "KeyedMapStateFactory", + "KeyedPriorityQueueStateFactory", + "KeyedAggregatingStateFactory", + "KeyedReducingStateFactory", + "KeyedValueState", + "KeyedMapState", + "KeyedListState", + "KeyedPriorityQueueState", + "KeyedAggregatingState", + "KeyedReducingState", +] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/codec/__init__.py b/python/functionstream-api-advanced/src/fs_api_advanced/codec/__init__.py new file mode 100644 index 00000000..3c3d4ba1 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/codec/__init__.py @@ -0,0 +1,33 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from .base import Codec +from .json_codec import JsonCodec +from .pickle_codec import PickleCodec +from .bytes_codec import BytesCodec +from .string_codec import StringCodec +from .bool_codec import BoolCodec +from .int_codec import IntCodec +from .float_codec import FloatCodec +from .default_codec import default_codec_for + +__all__ = [ + "Codec", + "JsonCodec", + "PickleCodec", + "BytesCodec", + "StringCodec", + "BoolCodec", + "IntCodec", + "FloatCodec", + "default_codec_for", +] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/codec/base.py b/python/functionstream-api-advanced/src/fs_api_advanced/codec/base.py new file mode 100644 index 00000000..80e05b17 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/codec/base.py @@ -0,0 +1,29 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Generic, TypeVar + +T = TypeVar("T") + + +class Codec(Generic[T]): + supports_ordered_keys: bool = False + + def encode(self, value: T) -> bytes: + raise NotImplementedError + + def decode(self, data: bytes) -> T: + raise NotImplementedError + + def encoded_size(self) -> int: + """Return fixed encoded byte length; >0 means fixed-size, 0 or negative means variable length.""" + return 0 diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/codec/bool_codec.py b/python/functionstream-api-advanced/src/fs_api_advanced/codec/bool_codec.py new file mode 100644 index 00000000..f151ff6a --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/codec/bool_codec.py @@ -0,0 +1,33 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
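A minimal sketch of how a custom codec might subclass the `Codec` base from `codec/base.py` above. The `Codec` class below is a local stand-in that mirrors that file so the snippet runs on its own, and `UInt16Codec` is a hypothetical name for illustration, not part of the package.

```python
import struct
from typing import Generic, TypeVar

T = TypeVar("T")


class Codec(Generic[T]):
    """Local stand-in mirroring fs_api_advanced.codec.base.Codec."""

    supports_ordered_keys: bool = False

    def encode(self, value: T) -> bytes:
        raise NotImplementedError

    def decode(self, data: bytes) -> T:
        raise NotImplementedError

    def encoded_size(self) -> int:
        # 0 (or negative) signals a variable-length encoding.
        return 0


class UInt16Codec(Codec[int]):
    """Hypothetical fixed-size codec: big-endian unsigned 16-bit integers."""

    # Big-endian unsigned bytes compare in the same order as the numbers.
    supports_ordered_keys = True

    def encoded_size(self) -> int:
        return 2

    def encode(self, value: int) -> bytes:
        if not 0 <= value <= 0xFFFF:
            raise ValueError(f"value out of uint16 range: {value}")
        return struct.pack(">H", value)

    def decode(self, data: bytes) -> int:
        if len(data) != 2:
            raise ValueError(f"invalid uint16 payload length: {len(data)}")
        return struct.unpack(">H", data)[0]
```

Reporting a positive `encoded_size()` is what lets list-style state skip a per-element length prefix, so a codec should only do so when every encoded value has exactly that length.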
+ +from .base import Codec + + +class BoolCodec(Codec[bool]): + supports_ordered_keys = True + _SIZE = 1 + + def encoded_size(self) -> int: + return self._SIZE + + def encode(self, value: bool) -> bytes: + return b"\x01" if value else b"\x00" + + def decode(self, data: bytes) -> bool: + if len(data) != 1: + raise ValueError(f"invalid bool payload length: {len(data)}") + if data == b"\x00": + return False + if data == b"\x01": + return True + raise ValueError(f"invalid bool payload byte: {data[0]}") diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/codec/bytes_codec.py b/python/functionstream-api-advanced/src/fs_api_advanced/codec/bytes_codec.py new file mode 100644 index 00000000..3ac38b15 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/codec/bytes_codec.py @@ -0,0 +1,23 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from .base import Codec + + +class BytesCodec(Codec[bytes]): + supports_ordered_keys = True + + def encode(self, value: bytes) -> bytes: + return bytes(value) + + def decode(self, data: bytes) -> bytes: + return bytes(data) diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/codec/default_codec.py b/python/functionstream-api-advanced/src/fs_api_advanced/codec/default_codec.py new file mode 100644 index 00000000..67aa17fd --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/codec/default_codec.py @@ -0,0 +1,47 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any, Type + +from .base import Codec +from .bool_codec import BoolCodec +from .bytes_codec import BytesCodec +from .float_codec import FloatCodec +from .int_codec import IntCodec +from .json_codec import JsonCodec +from .pickle_codec import PickleCodec +from .string_codec import StringCodec + + +def default_codec_for(value_type: Type[Any]) -> Codec[Any]: + """ + Return a default Codec for the given type, aligned with Go DefaultCodecFor. + Built-in types use ordered codecs where applicable; list/dict use JsonCodec; else PickleCodec. 
+ """ + if value_type is bool: + return BoolCodec() + if value_type is int: + return IntCodec() + if value_type is float: + return FloatCodec() + if value_type is str: + return StringCodec() + if value_type is bytes: + return BytesCodec() + try: + if issubclass(value_type, int) and value_type is not bool: + return IntCodec() + except TypeError: + pass + if value_type is list or value_type is dict: + return JsonCodec() + return PickleCodec() diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/codec/float_codec.py b/python/functionstream-api-advanced/src/fs_api_advanced/codec/float_codec.py new file mode 100644 index 00000000..1d8078f7 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/codec/float_codec.py @@ -0,0 +1,42 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
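The dispatch order in `default_codec_for` above has a few edge cases that are easy to miss. The sketch below mirrors the same branch order but returns codec names as strings so it runs without the package. `default_codec_name` and `Color` are hypothetical names used only for illustration.

```python
import enum
from typing import Any, List


def default_codec_name(value_type: Any) -> str:
    """Mirror of the branch order in default_codec_for, returning names only."""
    if value_type is bool:
        return "BoolCodec"
    if value_type is int:
        return "IntCodec"
    if value_type is float:
        return "FloatCodec"
    if value_type is str:
        return "StringCodec"
    if value_type is bytes:
        return "BytesCodec"
    try:
        # int subclasses (for example IntEnum) reuse the ordered int codec.
        if issubclass(value_type, int) and value_type is not bool:
            return "IntCodec"
    except TypeError:
        # Non-class arguments such as typing.List[int] land here.
        pass
    if value_type is list or value_type is dict:
        return "JsonCodec"
    return "PickleCodec"


class Color(enum.IntEnum):
    RED = 1


print(default_codec_name(Color))      # int subclass
print(default_codec_name(list))       # bare list
print(default_codec_name(List[int]))  # parameterized alias falls through
```

Because the list and dict checks use identity (`is`), a parameterized alias such as `List[int]` or a `dict` subclass does not match them and ends up with the pickle-based fallback.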
+ +import struct + +from .base import Codec + + +class FloatCodec(Codec[float]): + """Ordered float codec for range scans (lexicographic byte order).""" + supports_ordered_keys = True + _SIZE = 8 + + def encoded_size(self) -> int: + return self._SIZE + + def encode(self, value: float) -> bytes: + bits = struct.unpack(">Q", struct.pack(">d", value))[0] + if bits & (1 << 63): + mapped = (~bits) & 0xFFFFFFFFFFFFFFFF + else: + mapped = bits ^ (1 << 63) + return struct.pack(">Q", mapped) + + def decode(self, data: bytes) -> float: + if len(data) != 8: + raise ValueError(f"invalid float payload length: {len(data)}") + mapped = struct.unpack(">Q", data)[0] + if mapped & (1 << 63): + bits = mapped ^ (1 << 63) + else: + bits = (~mapped) & 0xFFFFFFFFFFFFFFFF + return struct.unpack(">d", struct.pack(">Q", bits))[0] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/codec/int_codec.py b/python/functionstream-api-advanced/src/fs_api_advanced/codec/int_codec.py new file mode 100644 index 00000000..11e92725 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/codec/int_codec.py @@ -0,0 +1,46 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
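To illustrate why the bit mapping in `FloatCodec` above yields keys that sort correctly as raw bytes, here is a standalone sketch of the same transform. The function names `encode_ordered_float` and `decode_ordered_float` are hypothetical and mirror the class's `encode` and `decode` methods.

```python
import struct

_MASK_64 = 0xFFFFFFFFFFFFFFFF
_SIGN_BIT = 1 << 63


def encode_ordered_float(value: float) -> bytes:
    # Reinterpret the IEEE 754 double as an unsigned 64-bit integer.
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    if bits & _SIGN_BIT:
        # Negative: invert every bit so larger magnitudes sort first.
        mapped = (~bits) & _MASK_64
    else:
        # Non-negative: set the sign bit so these sort after all negatives.
        mapped = bits ^ _SIGN_BIT
    return struct.pack(">Q", mapped)


def decode_ordered_float(data: bytes) -> float:
    if len(data) != 8:
        raise ValueError(f"invalid float payload length: {len(data)}")
    mapped = struct.unpack(">Q", data)[0]
    if mapped & _SIGN_BIT:
        bits = mapped ^ _SIGN_BIT
    else:
        bits = (~mapped) & _MASK_64
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


values = [3.5, -1.0, 0.0, -250.25, 1e-9, float("inf"), float("-inf")]
# Sorting by encoded bytes gives the same order as sorting numerically.
by_bytes = sorted(values, key=encode_ordered_float)
print(by_bytes == sorted(values))
```

Lexicographic comparison of the 8-byte big-endian output matches numeric comparison for these finite and infinite values. NaN has no numeric order, so its position under this encoding depends only on its bit pattern.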
+ +import struct + +from .base import Codec + + +class IntCodec(Codec[int]): + supports_ordered_keys = True + + _PACKER = struct.Struct(">Q") + _SIZE = 8 + + def encoded_size(self) -> int: + return self._SIZE + _MASK_64 = 0xFFFFFFFFFFFFFFFF + _SIGN_BIT = 1 << 63 + _TWO_TO_64 = 1 << 64 + + def encode(self, value: int) -> bytes: + if not (-self._SIGN_BIT <= value < self._SIGN_BIT): + raise ValueError(f"Value out of 64-bit signed integer range: {value}") + mapped = (value & self._MASK_64) ^ self._SIGN_BIT + return self._PACKER.pack(mapped) + + def decode(self, data: bytes) -> int: + try: + mapped = self._PACKER.unpack(data)[0] + except struct.error: + raise ValueError( + f"Invalid int payload length or format, expected 8 bytes, got {len(data)}" + ) + raw = mapped ^ self._SIGN_BIT + if raw >= self._SIGN_BIT: + return raw - self._TWO_TO_64 + return raw diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/codec/json_codec.py b/python/functionstream-api-advanced/src/fs_api_advanced/codec/json_codec.py new file mode 100644 index 00000000..b35416d2 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/codec/json_codec.py @@ -0,0 +1,26 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
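The sign-bit flip in `IntCodec` above can be checked in isolation. The sketch below reimplements the same mapping as free functions (`encode_ordered_int` and `decode_ordered_int` are hypothetical names) and shows that signed 64-bit integers keep their numeric order as big-endian bytes.

```python
import struct

_MASK_64 = 0xFFFFFFFFFFFFFFFF
_SIGN_BIT = 1 << 63
_TWO_TO_64 = 1 << 64


def encode_ordered_int(value: int) -> bytes:
    if not (-_SIGN_BIT <= value < _SIGN_BIT):
        raise ValueError(f"Value out of 64-bit signed integer range: {value}")
    # Two's-complement view, then flip the sign bit: negatives map to the
    # lower half of the unsigned range, non-negatives to the upper half.
    mapped = (value & _MASK_64) ^ _SIGN_BIT
    return struct.pack(">Q", mapped)


def decode_ordered_int(data: bytes) -> int:
    if len(data) != 8:
        raise ValueError(f"expected 8 bytes, got {len(data)}")
    raw = struct.unpack(">Q", data)[0] ^ _SIGN_BIT
    # Values with the top bit set are negative in two's complement.
    return raw - _TWO_TO_64 if raw >= _SIGN_BIT else raw


samples = [-(1 << 63), -1000, -1, 0, 1, 1000, (1 << 63) - 1]
encoded = [encode_ordered_int(v) for v in samples]
print(encoded == sorted(encoded))
```

Without the flip, the two's-complement bytes of `-1` (`0xFF...FF`) would sort after every positive value, which would break range scans over integer keys.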
+ +import json +from typing import TypeVar + +from .base import Codec + +T = TypeVar("T") + + +class JsonCodec(Codec[T]): + def encode(self, value: T) -> bytes: + return json.dumps(value).encode("utf-8") + + def decode(self, data: bytes) -> T: + return json.loads(data.decode("utf-8")) diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/codec/pickle_codec.py b/python/functionstream-api-advanced/src/fs_api_advanced/codec/pickle_codec.py new file mode 100644 index 00000000..84066fc2 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/codec/pickle_codec.py @@ -0,0 +1,27 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import TypeVar + +import cloudpickle + +from .base import Codec + +T = TypeVar("T") + + +class PickleCodec(Codec[T]): + def encode(self, value: T) -> bytes: + return cloudpickle.dumps(value) + + def decode(self, data: bytes) -> T: + return cloudpickle.loads(data) diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/codec/string_codec.py b/python/functionstream-api-advanced/src/fs_api_advanced/codec/string_codec.py new file mode 100644 index 00000000..6bb112a4 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/codec/string_codec.py @@ -0,0 +1,23 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from .base import Codec + + +class StringCodec(Codec[str]): + supports_ordered_keys = True + + def encode(self, value: str) -> bytes: + return value.encode("utf-8") + + def decode(self, data: bytes) -> str: + return data.decode("utf-8") diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/keyed/__init__.py b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/__init__.py new file mode 100644 index 00000000..e575ec37 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/__init__.py @@ -0,0 +1,35 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from .keyed_value_state import KeyedValueState, KeyedValueStateFactory +from .keyed_list_state import KeyedListState, KeyedListStateFactory +from .keyed_map_state import KeyedMapState, KeyedMapStateFactory +from .keyed_priority_queue_state import KeyedPriorityQueueState, KeyedPriorityQueueStateFactory +from .keyed_aggregating_state import AggregateFunc, KeyedAggregatingState, KeyedAggregatingStateFactory +from .keyed_reducing_state import KeyedReducingState, KeyedReducingStateFactory, ReduceFunc + +__all__ = [ + "KeyedListStateFactory", + "KeyedValueStateFactory", + "KeyedMapStateFactory", + "KeyedPriorityQueueStateFactory", + "KeyedAggregatingStateFactory", + "KeyedReducingStateFactory", + "KeyedValueState", + "KeyedListState", + "KeyedMapState", + "KeyedPriorityQueueState", + "KeyedAggregatingState", + "KeyedReducingState", + "AggregateFunc", + "ReduceFunc", +] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/keyed/_keyed_common.py b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/_keyed_common.py new file mode 100644 index 00000000..03c1822a --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/_keyed_common.py @@ -0,0 +1,24 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Any + +from fs_api.store import KvError + +def ensure_ordered_key_codec(codec: Any, label: str) -> None: + if not getattr(codec, "supports_ordered_keys", False): + raise KvError(f"{label} key codec must support ordered key encoding") + + +__all__ = [ + "ensure_ordered_key_codec", +] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_aggregating_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_aggregating_state.py new file mode 100644 index 00000000..e9afbef0 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_aggregating_state.py @@ -0,0 +1,144 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any, Generic, Optional, Protocol, Tuple, TypeVar + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, default_codec_for + +T_agg = TypeVar("T_agg") +ACC = TypeVar("ACC") +R = TypeVar("R") + + +class AggregateFunc(Protocol[T_agg, ACC, R]): + def create_accumulator(self) -> ACC: ... + def add(self, value: T_agg, accumulator: ACC) -> ACC: ... + def get_result(self, accumulator: ACC) -> R: ... + def merge(self, a: ACC, b: ACC) -> ACC: ... + + +class KeyedAggregatingStateFactory(Generic[T_agg, ACC, R]): + """Factory for keyed aggregating state. 
Create from context with key_group; obtain state per (primary_key, state_name).""" + + def __init__( + self, + store: KvStore, + key_group: bytes, + acc_codec: Codec[ACC], + agg_func: AggregateFunc[T_agg, ACC, R], + ): + if store is None: + raise KvError("keyed aggregating state factory store must not be None") + if key_group is None: + raise KvError("keyed aggregating state factory key_group must not be None") + if acc_codec is None: + raise KvError("keyed aggregating state factory acc_codec must not be None") + if agg_func is None: + raise KvError("keyed aggregating state factory agg_func must not be None") + self._store = store + self._key_group = key_group + self._acc_codec = acc_codec + self._agg_func = agg_func + + @classmethod + def from_context( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + acc_codec: Codec[ACC], + agg_func: AggregateFunc[T_agg, ACC, R], + ) -> "KeyedAggregatingStateFactory[T_agg, ACC, R]": + """Create a KeyedAggregatingStateFactory from a context and store name (for keyed operators).""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, key_group, acc_codec, agg_func) + + @classmethod + def from_context_auto_codec( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + agg_func: AggregateFunc[T_agg, ACC, R], + acc_type: Optional[type] = None, + ) -> "KeyedAggregatingStateFactory[T_agg, ACC, R]": + """Create a KeyedAggregatingStateFactory with default accumulator codec from context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + if acc_type is None: + raise KvError("keyed aggregating state from_context_auto_codec requires acc_type") + codec = default_codec_for(acc_type) + return cls(store, key_group, codec, agg_func) + + def new_aggregating_state( + self, primary_key: bytes, state_name: str = "" + ) -> "KeyedAggregatingState[T_agg, ACC, R]": + """Create a KeyedAggregatingState for the given primary key and state name (state_name becomes namespace bytes).""" + if primary_key is None: + raise 
KvError("keyed aggregating state primary_key must not be None") + namespace = state_name.encode("utf-8") if state_name else b"" + return KeyedAggregatingState(self, primary_key, namespace) + + +class KeyedAggregatingState(Generic[T_agg, ACC, R]): + """Aggregating state for one (primary_key, namespace). Use add(value) to merge, get() to read result, clear() to remove.""" + + def __init__( + self, + factory: KeyedAggregatingStateFactory[T_agg, ACC, R], + primary_key: bytes, + namespace: bytes, + ): + if factory is None: + raise KvError("keyed aggregating state factory must not be None") + if primary_key is None: + raise KvError("keyed aggregating state primary_key must not be None") + if namespace is None: + raise KvError("keyed aggregating state namespace must not be None") + self._factory = factory + self._primary_key = primary_key + self._namespace = namespace + + def _complex_key(self) -> ComplexKey: + return ComplexKey( + key_group=self._factory._key_group, + key=self._primary_key, + namespace=self._namespace, + user_key=b"", + ) + + def add(self, value: T_agg) -> None: + """Add a value into this state's accumulator (get or create acc, add, put back).""" + ck = self._complex_key() + raw = self._factory._store.get(ck) + if raw is None: + acc = self._factory._agg_func.create_accumulator() + else: + acc = self._factory._acc_codec.decode(raw) + new_acc = self._factory._agg_func.add(value, acc) + self._factory._store.put(ck, self._factory._acc_codec.encode(new_acc)) + + def get(self) -> Tuple[Optional[R], bool]: + """Return (result, found). 
found is False when no accumulator exists.""" + ck = self._complex_key() + raw = self._factory._store.get(ck) + if raw is None: + return (None, False) + acc = self._factory._acc_codec.decode(raw) + return (self._factory._agg_func.get_result(acc), True) + + def clear(self) -> None: + """Remove this state's accumulator.""" + self._factory._store.delete(self._complex_key()) + + +__all__ = ["AggregateFunc", "KeyedAggregatingState", "KeyedAggregatingStateFactory"] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_list_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_list_state.py new file mode 100644 index 00000000..70627dd8 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_list_state.py @@ -0,0 +1,208 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import struct +from typing import Any, Generic, List, Optional, Tuple, TypeVar + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, default_codec_for + +V = TypeVar("V") + + +def _fixed_encoded_size(codec: Codec[Any]) -> Tuple[int, bool]: + """Return (fixed_size, is_fixed). is_fixed is True when encoded_size() > 0.""" + n = codec.encoded_size() + return (n, n > 0) + + +class KeyedListStateFactory(Generic[V]): + """Factory for keyed list state. 
Create from context with key_group; obtain list per (key, namespace).""" + + def __init__( + self, + store: KvStore, + key_group: bytes, + value_codec: Codec[V], + ): + if store is None: + raise KvError("keyed list state factory store must not be None") + if key_group is None: + raise KvError("keyed list state factory key_group must not be None") + if value_codec is None: + raise KvError("keyed list value codec must not be None") + self._store = store + self._key_group = key_group + self._value_codec = value_codec + self._fixed_size, self._is_fixed = _fixed_encoded_size(value_codec) + + @classmethod + def from_context( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + value_codec: Codec[V], + ) -> "KeyedListStateFactory[V]": + """Create a KeyedListStateFactory from a context and store name (for keyed operators).""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, key_group, value_codec) + + @classmethod + def from_context_auto_codec( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + value_type: Optional[type] = None, + ) -> "KeyedListStateFactory[V]": + """Create a KeyedListStateFactory with default value codec from context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + if value_type is None: + raise KvError("keyed list state from_context_auto_codec requires value_type") + codec = default_codec_for(value_type) + return cls(store, key_group, codec) + + def new_keyed_list(self, key: bytes, namespace: bytes) -> "KeyedListState[V]": + """Create a KeyedListState for the given key and namespace (e.g. stream key and window namespace).""" + return KeyedListState(self, key, namespace) + + +class KeyedListState(Generic[V]): + """List state for one (key, namespace). 
Use add/add_all to append, get to read, update to replace, clear to remove.""" + + def __init__(self, factory: KeyedListStateFactory[V], key: bytes, namespace: bytes): + if factory is None: + raise KvError("keyed list factory must not be None") + if key is None: + raise KvError("keyed list key must not be None") + if namespace is None: + raise KvError("keyed list namespace must not be None") + self._factory = factory + self._key = key + self._namespace = namespace + + def _complex_key(self) -> ComplexKey: + return ComplexKey( + key_group=self._factory._key_group, + key=self._key, + namespace=self._namespace, + user_key=b"", + ) + + def add(self, value: V) -> None: + payload = self._serialize_one(value) + self._factory._store.merge(self._complex_key(), payload) + + def add_all(self, values: List[V]) -> None: + if not values: + return + payload = self._serialize_batch(values) + self._factory._store.merge(self._complex_key(), payload) + + def get(self) -> List[V]: + raw = self._factory._store.get(self._complex_key()) + if raw is None: + return [] + return self._deserialize(raw) + + def update(self, values: List[V]) -> None: + """Replace the list with the given values (clear then put batch, matching Go).""" + self.clear() + payload = self._serialize_batch(values) + self._factory._store.put(self._complex_key(), payload) + + def clear(self) -> None: + self._factory._store.delete(self._complex_key()) + + def _serialize_one(self, value: V) -> bytes: + if self._factory._is_fixed: + return self._serialize_one_fixed(value) + return self._serialize_one_var_len(value) + + def _serialize_batch(self, values: List[V]) -> bytes: + if self._factory._is_fixed: + return self._serialize_batch_fixed(values) + return self._serialize_batch_var_len(values) + + def _deserialize(self, raw: bytes) -> List[V]: + if self._factory._is_fixed: + return self._deserialize_fixed(raw) + return self._deserialize_var_len(raw) + + def _serialize_one_var_len(self, value: V) -> bytes: + encoded = 
self._factory._value_codec.encode(value) + return struct.pack(">I", len(encoded)) + encoded + + def _serialize_batch_var_len(self, values: List[V]) -> bytes: + parts: List[bytes] = [] + for v in values: + encoded = self._factory._value_codec.encode(v) + parts.append(struct.pack(">I", len(encoded)) + encoded) + return b"".join(parts) + + def _deserialize_var_len(self, raw: bytes) -> List[V]: + out: List[V] = [] + idx = 0 + while idx < len(raw): + if len(raw) - idx < 4: + raise KvError("corrupted keyed list payload: truncated length") + (item_len,) = struct.unpack(">I", raw[idx : idx + 4]) + idx += 4 + if item_len < 0 or len(raw) - idx < item_len: + raise KvError("corrupted keyed list payload: invalid element length") + item_raw = raw[idx : idx + item_len] + idx += item_len + out.append(self._factory._value_codec.decode(item_raw)) + return out + + def _serialize_one_fixed(self, value: V) -> bytes: + fixed = self._factory._fixed_size + if fixed <= 0: + raise KvError("fixed-size codec must report positive size") + encoded = self._factory._value_codec.encode(value) + if len(encoded) != fixed: + raise KvError( + f"fixed-size codec encoded unexpected length: got {len(encoded)}, want {fixed}" + ) + return encoded + + def _serialize_batch_fixed(self, values: List[V]) -> bytes: + fixed = self._factory._fixed_size + if fixed <= 0: + raise KvError("fixed-size codec must report positive size") + parts: List[bytes] = [] + for v in values: + encoded = self._factory._value_codec.encode(v) + if len(encoded) != fixed: + raise KvError( + f"fixed-size codec encoded unexpected length: got {len(encoded)}, want {fixed}" + ) + parts.append(encoded) + return b"".join(parts) + + def _deserialize_fixed(self, raw: bytes) -> List[V]: + fixed = self._factory._fixed_size + if fixed <= 0: + raise KvError("fixed-size codec must report positive size") + if len(raw) % fixed != 0: + raise KvError("corrupted keyed list payload: fixed-size data length mismatch") + out: List[V] = [] + idx = 0 + while 
idx < len(raw): + item_raw = raw[idx : idx + fixed] + idx += fixed + out.append(self._factory._value_codec.decode(item_raw)) + return out + + +__all__ = ["KeyedListState", "KeyedListStateFactory"] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_map_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_map_state.py new file mode 100644 index 00000000..08090918 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_map_state.py @@ -0,0 +1,173 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any, Generic, Iterator, Optional, Tuple, TypeVar + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, default_codec_for + +from ._keyed_common import ensure_ordered_key_codec + +MK = TypeVar("MK") +MV = TypeVar("MV") + + +class KeyedMapStateFactory(Generic[MK, MV]): + """Factory for keyed map state. 
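The variable-length path in `KeyedListState` above frames each element with a 4-byte big-endian length prefix. The sketch below reproduces that framing on plain bytes. `frame` and `unframe_all` are hypothetical helper names, and the sketch raises `ValueError` where the source raises `KvError`. It assumes that the store's `merge` appends the new payload to the existing value, which the diff implies but does not show.

```python
import struct
from typing import List


def frame(item: bytes) -> bytes:
    """Prefix one encoded element with its length (4 bytes, big-endian)."""
    return struct.pack(">I", len(item)) + item


def unframe_all(raw: bytes) -> List[bytes]:
    """Split a concatenation of frames back into the encoded elements."""
    out: List[bytes] = []
    idx = 0
    while idx < len(raw):
        if len(raw) - idx < 4:
            raise ValueError("corrupted payload: truncated length")
        (item_len,) = struct.unpack(">I", raw[idx : idx + 4])
        idx += 4
        if len(raw) - idx < item_len:
            raise ValueError("corrupted payload: invalid element length")
        out.append(raw[idx : idx + item_len])
        idx += item_len
    return out


# Simulate three separate add() calls whose payloads were appended in order.
stored = frame(b"a") + frame(b"") + frame(b"xyz")
print(unframe_all(stored))
```

Since every frame is self-delimiting, a single `add` and a batched `add_all` produce the same byte layout, so appends never require reading and rewriting the existing list.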
Create from context with key_group; obtain map per (primary_key, map_name).""" + + def __init__( + self, + store: KvStore, + key_group: bytes, + map_key_codec: Codec[MK], + map_value_codec: Codec[MV], + ): + if store is None: + raise KvError("keyed map state factory store must not be None") + if key_group is None: + raise KvError("keyed map state factory key_group must not be None") + if map_key_codec is None or map_value_codec is None: + raise KvError( + "keyed map state factory map_key_codec and map_value_codec must not be None" + ) + ensure_ordered_key_codec(map_key_codec, "keyed map") + self._store = store + self._key_group = key_group + self._map_key_codec = map_key_codec + self._map_value_codec = map_value_codec + + @classmethod + def from_context( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + map_key_codec: Codec[MK], + map_value_codec: Codec[MV], + ) -> "KeyedMapStateFactory[MK, MV]": + """Create a KeyedMapStateFactory from a context and store name (for keyed operators).""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, key_group, map_key_codec, map_value_codec) + + @classmethod + def from_context_auto_codec( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + map_key_type: Optional[type] = None, + map_value_type: Optional[type] = None, + ) -> "KeyedMapStateFactory[MK, MV]": + """Create a KeyedMapStateFactory with default codecs for MK and MV. 
Map key type must have an ordered default codec.""" + store = ctx.getOrCreateKVStore(store_name) + if map_key_type is None: + raise KvError("keyed map state from_context_auto_codec requires map_key_type") + if map_value_type is None: + raise KvError("keyed map state from_context_auto_codec requires map_value_type") + map_key_codec = default_codec_for(map_key_type) + map_value_codec = default_codec_for(map_value_type) + return cls(store, key_group, map_key_codec, map_value_codec) + + def new_keyed_map( + self, primary_key: bytes, map_name: str + ) -> "KeyedMapState[MK, MV]": + """Create a KeyedMapState for the given primary key and map name (map_name becomes namespace).""" + if primary_key is None: + raise KvError("keyed map state primary_key must not be None") + if not map_name: + raise KvError("keyed map state map_name is required") + namespace = map_name.encode("utf-8") + return KeyedMapState(self, primary_key, namespace) + + +class KeyedMapState(Generic[MK, MV]): + """Map state for one (primary_key, namespace). 
put/get/delete by map_key; clear() removes all entries; all() iterates (mk, mv).""" + + def __init__( + self, + factory: KeyedMapStateFactory[MK, MV], + primary_key: bytes, + namespace: bytes, + ): + if factory is None: + raise KvError("keyed map state factory must not be None") + if primary_key is None: + raise KvError("keyed map state primary_key must not be None") + if namespace is None: + raise KvError("keyed map state namespace must not be None") + self._factory = factory + self._primary_key = primary_key + self._namespace = namespace + + def _complex_key(self, map_key: MK) -> ComplexKey: + encoded = self._factory._map_key_codec.encode(map_key) + return ComplexKey( + key_group=self._factory._key_group, + key=self._primary_key, + namespace=self._namespace, + user_key=encoded, + ) + + def put(self, map_key: MK, value: MV) -> None: + ck = self._complex_key(map_key) + self._factory._store.put(ck, self._factory._map_value_codec.encode(value)) + + def get(self, map_key: MK) -> Tuple[Optional[MV], bool]: + """Return (value, found). found is False when the key is missing.""" + ck = self._complex_key(map_key) + raw = self._factory._store.get(ck) + if raw is None: + return (None, False) + return (self._factory._map_value_codec.decode(raw), True) + + def delete(self, map_key: MK) -> None: + ck = self._complex_key(map_key) + self._factory._store.delete(ck) + + def clear(self) -> None: + """Remove all entries in this map (delete by prefix).""" + prefix_ck = ComplexKey( + key_group=self._factory._key_group, + key=self._primary_key, + namespace=self._namespace, + user_key=b"", + ) + self._factory._store.delete_prefix(prefix_ck) + + def all(self) -> Iterator[Tuple[MK, MV]]: + """Iterate over all (map_key, value) pairs in this map. 
Skips entries that fail to decode.""" + it = self._factory._store.scan_complex( + self._factory._key_group, + self._primary_key, + self._namespace, + ) + try: + while it.has_next(): + item = it.next() + if item is None: + break + key_raw, value_raw = item + try: + k = self._factory._map_key_codec.decode(key_raw) + except Exception: + continue + try: + v = self._factory._map_value_codec.decode(value_raw) + except Exception: + continue + yield (k, v) + finally: + if hasattr(it, "close"): + it.close() + + +__all__ = ["KeyedMapState", "KeyedMapStateFactory"] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_priority_queue_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_priority_queue_state.py new file mode 100644 index 00000000..ea0d3dae --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_priority_queue_state.py @@ -0,0 +1,175 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any, Generic, Iterator, Optional, Tuple, TypeVar + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, default_codec_for + +from ._keyed_common import ensure_ordered_key_codec + +V = TypeVar("V") + + +class KeyedPriorityQueueStateFactory(Generic[V]): + """Factory for keyed priority queue state. 
Create from context with key_group; obtain queue per (primary_key, namespace).""" + + def __init__( + self, + store: KvStore, + key_group: bytes, + value_codec: Codec[V], + ): + if store is None: + raise KvError("keyed priority queue state factory store must not be None") + if key_group is None: + raise KvError("keyed priority queue state factory key_group must not be None") + if value_codec is None: + raise KvError("keyed priority queue state factory value codec must not be None") + ensure_ordered_key_codec(value_codec, "keyed priority queue value") + self._store = store + self._key_group = key_group + self._value_codec = value_codec + + @classmethod + def from_context( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + item_codec: Codec[V], + ) -> "KeyedPriorityQueueStateFactory[V]": + """Create a KeyedPriorityQueueStateFactory from a context and store name (for keyed operators).""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, key_group, item_codec) + + @classmethod + def from_context_auto_codec( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + item_type: Optional[type] = None, + ) -> "KeyedPriorityQueueStateFactory[V]": + """Create a KeyedPriorityQueueStateFactory with default value codec. 
V must have an ordered default codec.""" + store = ctx.getOrCreateKVStore(store_name) + if item_type is None: + raise KvError("keyed priority queue state from_context_auto_codec requires item_type") + codec = default_codec_for(item_type) + return cls(store, key_group, codec) + + def new_keyed_priority_queue( + self, primary_key: bytes, namespace: bytes + ) -> "KeyedPriorityQueueState[V]": + """Create a KeyedPriorityQueueState for the given primary key and namespace.""" + if primary_key is None: + raise KvError("keyed priority queue state primary_key must not be None") + if namespace is None: + raise KvError("keyed priority queue state namespace is required") + return KeyedPriorityQueueState(self, primary_key, namespace) + + +class KeyedPriorityQueueState(Generic[V]): + """Priority queue state for one (primary_key, namespace). Elements ordered by encoded user key. add/peek/poll/clear/all.""" + + def __init__( + self, + factory: KeyedPriorityQueueStateFactory[V], + primary_key: bytes, + namespace: bytes, + ): + if factory is None: + raise KvError("keyed priority queue state factory must not be None") + if primary_key is None: + raise KvError("keyed priority queue state primary_key must not be None") + if namespace is None: + raise KvError("keyed priority queue state namespace must not be None") + self._factory = factory + self._primary_key = primary_key + self._namespace = namespace + + def _complex_key(self, user_key: bytes) -> ComplexKey: + return ComplexKey( + key_group=self._factory._key_group, + key=self._primary_key, + namespace=self._namespace, + user_key=user_key, + ) + + def _prefix_ck(self) -> ComplexKey: + return ComplexKey( + key_group=self._factory._key_group, + key=self._primary_key, + namespace=self._namespace, + user_key=b"", + ) + + def add(self, value: V) -> None: + """Add an element (stored with encoded value as user key, empty value bytes).""" + user_key = self._factory._value_codec.encode(value) + 
self._factory._store.put(self._complex_key(user_key), b"") + + def peek(self) -> Tuple[Optional[V], bool]: + """Return (min element, found). found is False when queue is empty.""" + it = self._factory._store.scan_complex( + self._factory._key_group, + self._primary_key, + self._namespace, + ) + try: + if not it.has_next(): + return (None, False) + item = it.next() + if item is None: + return (None, False) + user_key, _ = item + return (self._factory._value_codec.decode(user_key), True) + finally: + if hasattr(it, "close"): + it.close() + + def poll(self) -> Tuple[Optional[V], bool]: + """Remove and return (min element, found). Same as peek then delete if found.""" + val, found = self.peek() + if not found: + return (val, found) + user_key = self._factory._value_codec.encode(val) + self._factory._store.delete(self._complex_key(user_key)) + return (val, True) + + def clear(self) -> None: + """Remove all elements in this queue (delete by prefix).""" + self._factory._store.delete_prefix(self._prefix_ck()) + + def all(self) -> Iterator[V]: + """Iterate over all elements in order. 
Skips entries that fail to decode.""" + it = self._factory._store.scan_complex( + self._factory._key_group, + self._primary_key, + self._namespace, + ) + try: + while it.has_next(): + item = it.next() + if item is None: + break + user_key, _ = item + try: + yield self._factory._value_codec.decode(user_key) + except Exception: + continue + finally: + if hasattr(it, "close"): + it.close() + + +__all__ = ["KeyedPriorityQueueState", "KeyedPriorityQueueStateFactory"] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_reducing_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_reducing_state.py new file mode 100644 index 00000000..c429e6bc --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_reducing_state.py @@ -0,0 +1,137 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any, Callable, Generic, Optional, Tuple, TypeVar + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, default_codec_for + +V = TypeVar("V") + +# ReduceFunc(value1, value2) -> result. May raise on error (matches Go's (V, error) return). +ReduceFunc = Callable[[V, V], V] + + +class KeyedReducingStateFactory(Generic[V]): + """Factory for keyed reducing state. 
Create from context with key_group; obtain state per (primary_key, namespace).""" + + def __init__( + self, + store: KvStore, + key_group: bytes, + value_codec: Codec[V], + reduce_func: ReduceFunc[V], + ): + if store is None: + raise KvError("keyed reducing state factory store must not be None") + if key_group is None: + raise KvError("keyed reducing state factory key_group must not be None") + if value_codec is None or reduce_func is None: + raise KvError( + "keyed reducing state factory value_codec and reduce_func must not be None" + ) + self._store = store + self._key_group = key_group + self._value_codec = value_codec + self._reduce_func = reduce_func + + @classmethod + def from_context( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + value_codec: Codec[V], + reduce_func: ReduceFunc[V], + ) -> "KeyedReducingStateFactory[V]": + """Create a KeyedReducingStateFactory from a context and store name (for keyed operators).""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, key_group, value_codec, reduce_func) + + @classmethod + def from_context_auto_codec( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + reduce_func: ReduceFunc[V], + value_type: Optional[type] = None, + ) -> "KeyedReducingStateFactory[V]": + """Create a KeyedReducingStateFactory with default value codec from context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + if value_type is None: + raise KvError("keyed reducing state from_context_auto_codec requires value_type") + codec = default_codec_for(value_type) + return cls(store, key_group, codec, reduce_func) + + def new_reducing_state( + self, primary_key: bytes, namespace: bytes + ) -> "KeyedReducingState[V]": + """Create a KeyedReducingState for the given primary key and namespace.""" + if primary_key is None: + raise KvError("keyed reducing state primary_key must not be None") + if namespace is None: + raise KvError("keyed reducing state namespace is required") + return 
KeyedReducingState(self, primary_key, namespace) + + +class KeyedReducingState(Generic[V]): + """Reducing state for one (primary_key, namespace). add(value) merges with reduce_func; get() returns current value; clear() removes.""" + + def __init__( + self, + factory: KeyedReducingStateFactory[V], + primary_key: bytes, + namespace: bytes, + ): + if factory is None: + raise KvError("keyed reducing state factory must not be None") + if primary_key is None: + raise KvError("keyed reducing state primary_key must not be None") + if namespace is None: + raise KvError("keyed reducing state namespace must not be None") + self._factory = factory + self._primary_key = primary_key + self._namespace = namespace + + def _complex_key(self) -> ComplexKey: + return ComplexKey( + key_group=self._factory._key_group, + key=self._primary_key, + namespace=self._namespace, + user_key=b"", + ) + + def add(self, value: V) -> None: + """Merge value into this state using the factory's reduce function (get current, reduce with value, put).""" + ck = self._complex_key() + raw = self._factory._store.get(ck) + if raw is None: + result = value + else: + old_value = self._factory._value_codec.decode(raw) + result = self._factory._reduce_func(old_value, value) + self._factory._store.put(ck, self._factory._value_codec.encode(result)) + + def get(self) -> Tuple[Optional[V], bool]: + """Return (current value, found). 
found is False when no value has been set.""" + raw = self._factory._store.get(self._complex_key()) + if raw is None: + return (None, False) + return (self._factory._value_codec.decode(raw), True) + + def clear(self) -> None: + """Remove the stored value for this state.""" + self._factory._store.delete(self._complex_key()) + + +__all__ = ["ReduceFunc", "KeyedReducingState", "KeyedReducingStateFactory"] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_value_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_value_state.py new file mode 100644 index 00000000..fe050665 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/keyed/keyed_value_state.py @@ -0,0 +1,122 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any, Generic, Optional, Tuple, TypeVar + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, default_codec_for + +V = TypeVar("V") + + +class KeyedValueStateFactory(Generic[V]): + """Factory for keyed value state. 
Create from context with key_group; obtain state via new_keyed_value(primary_key, namespace).""" + + def __init__( + self, + store: KvStore, + key_group: bytes, + value_codec: Codec[V], + ): + if store is None: + raise KvError("keyed value state factory store must not be None") + if key_group is None: + raise KvError("keyed value state factory key_group must not be None") + if value_codec is None: + raise KvError("keyed value state factory value codec must not be None") + self._store = store + self._key_group = key_group + self._value_codec = value_codec + + @classmethod + def from_context( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + value_codec: Codec[V], + ) -> "KeyedValueStateFactory[V]": + """Create a KeyedValueStateFactory from a context and store name (for keyed operators).""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, key_group, value_codec) + + @classmethod + def from_context_auto_codec( + cls, + ctx: Any, + store_name: str, + key_group: bytes, + value_type: Optional[type] = None, + ) -> "KeyedValueStateFactory[V]": + """Create a KeyedValueStateFactory with default value codec from context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + if value_type is None: + raise KvError("keyed value state from_context_auto_codec requires value_type") + codec = default_codec_for(value_type) + return cls(store, key_group, codec) + + def new_keyed_value( + self, primary_key: bytes, namespace: bytes + ) -> "KeyedValueState[V]": + """Create a KeyedValueState for the given primary key and namespace.""" + if primary_key is None: + raise KvError("keyed value state primary_key must not be None") + if namespace is None: + raise KvError("keyed value state namespace is required") + return KeyedValueState(self, primary_key, namespace) + + +class KeyedValueState(Generic[V]): + """Value state for one (primary_key, namespace). 
update(value), value() -> (value, found), clear().""" + + def __init__( + self, + factory: KeyedValueStateFactory[V], + primary_key: bytes, + namespace: bytes, + ): + if factory is None: + raise KvError("keyed value state factory must not be None") + if primary_key is None: + raise KvError("keyed value state primary_key must not be None") + if namespace is None: + raise KvError("keyed value state namespace must not be None") + self._factory = factory + self._primary_key = primary_key + self._namespace = namespace + + def _complex_key(self) -> ComplexKey: + return ComplexKey( + key_group=self._factory._key_group, + key=self._primary_key, + namespace=self._namespace, + user_key=b"", + ) + + def update(self, value: V) -> None: + """Set the value for this state.""" + ck = self._complex_key() + self._factory._store.put(ck, self._factory._value_codec.encode(value)) + + def value(self) -> Tuple[Optional[V], bool]: + """Return (current value, found). found is False when no value has been set.""" + raw = self._factory._store.get(self._complex_key()) + if raw is None: + return (None, False) + return (self._factory._value_codec.decode(raw), True) + + def clear(self) -> None: + """Remove the stored value for this state.""" + self._factory._store.delete(self._complex_key()) + + +__all__ = ["KeyedValueState", "KeyedValueStateFactory"] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/structures/__init__.py b/python/functionstream-api-advanced/src/fs_api_advanced/structures/__init__.py new file mode 100644 index 00000000..9dce8ee3 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/structures/__init__.py @@ -0,0 +1,37 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from .value_state import ValueState +from .list_state import ListState +from .map_state import ( + MapEntry, + MapState, + infer_ordered_key_codec, + create_map_state_auto_key_codec, +) +from .priority_queue_state import PriorityQueueState +from .aggregating_state import AggregateFunc, AggregatingState +from .reducing_state import ReduceFunc, ReducingState + +__all__ = [ + "ValueState", + "ListState", + "MapEntry", + "MapState", + "infer_ordered_key_codec", + "create_map_state_auto_key_codec", + "PriorityQueueState", + "AggregateFunc", + "AggregatingState", + "ReduceFunc", + "ReducingState", +] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/structures/aggregating_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/structures/aggregating_state.py new file mode 100644 index 00000000..8f02c53b --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/structures/aggregating_state.py @@ -0,0 +1,103 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Any, Generic, Optional, Protocol, Tuple, TypeVar + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, PickleCodec + +T = TypeVar("T") +ACC = TypeVar("ACC") +R = TypeVar("R") + + +class AggregateFunc(Protocol[T, ACC, R]): + def create_accumulator(self) -> ACC: + ... + + def add(self, value: T, accumulator: ACC) -> ACC: + ... + + def get_result(self, accumulator: ACC) -> R: + ... + + def merge(self, a: ACC, b: ACC) -> ACC: + ... + + +class AggregatingState(Generic[T, ACC, R]): + def __init__( + self, + store: KvStore, + acc_codec: Codec[ACC], + agg_func: AggregateFunc[T, ACC, R], + ): + if store is None: + raise KvError("aggregating state store must not be None") + if acc_codec is None: + raise KvError("aggregating state acc codec must not be None") + if agg_func is None: + raise KvError("aggregating state agg func must not be None") + self._store = store + self._acc_codec = acc_codec + self._agg_func = agg_func + self._ck = ComplexKey( + key_group=b"", + key=b"", + namespace=b"", + user_key=b"", + ) + + @classmethod + def from_context( + cls, + ctx: Any, + store_name: str, + acc_codec: Codec[ACC], + agg_func: AggregateFunc[T, ACC, R], + ) -> "AggregatingState[T, ACC, R]": + """Create an AggregatingState from a context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, acc_codec, agg_func) + + @classmethod + def from_context_auto_codec( + cls, + ctx: Any, + store_name: str, + agg_func: AggregateFunc[T, ACC, R], + ) -> "AggregatingState[T, ACC, R]": + """Create an AggregatingState with default (pickle) accumulator codec from context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, PickleCodec(), agg_func) + + def add(self, value: T) -> None: + raw = self._store.get(self._ck) + if raw is None: + acc = self._agg_func.create_accumulator() + else: + acc = self._acc_codec.decode(raw) + new_acc = self._agg_func.add(value, acc) + 
self._store.put(self._ck, self._acc_codec.encode(new_acc)) + + def get(self) -> Tuple[Optional[R], bool]: + raw = self._store.get(self._ck) + if raw is None: + return (None, False) + acc = self._acc_codec.decode(raw) + return (self._agg_func.get_result(acc), True) + + def clear(self) -> None: + self._store.delete(self._ck) + + +__all__ = ["AggregateFunc", "AggregatingState"] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/structures/list_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/structures/list_state.py new file mode 100644 index 00000000..3ae9d41a --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/structures/list_state.py @@ -0,0 +1,102 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import struct +from typing import Generic, List, TypeVar, Any + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, PickleCodec + +T = TypeVar("T") + + +class ListState(Generic[T]): + def __init__(self, store: KvStore, codec: Codec[T]): + if store is None: + raise KvError("list state store must not be None") + if codec is None: + raise KvError("list state codec must not be None") + self._store = store + self._codec = codec + self._ck = ComplexKey( + key_group=b"", + key=b"", + namespace=b"", + user_key=b"", + ) + + @classmethod + def from_context(cls, ctx: Any, store_name: str, codec: Codec[T]) -> "ListState[T]": + """Create a ListState from a context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, codec) + + @classmethod + def from_context_auto_codec(cls, ctx: Any, store_name: str) -> "ListState[T]": + """Create a ListState with default (pickle) codec from context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, PickleCodec()) + + def add(self, value: T) -> None: + payload = self._serialize_one(value) + self._store.merge(self._ck, payload) + + def add_all(self, values: List[T]) -> None: + if not values: + return + payload = self._serialize_batch(values) + self._store.merge(self._ck, payload) + + def get(self) -> List[T]: + raw = self._store.get(self._ck) + if raw is None: + return [] + return self._deserialize(raw) + + def update(self, values: List[T]) -> None: + if len(values) == 0: + self.clear() + return + payload = self._serialize_batch(values) + self._store.put(self._ck, payload) + + def clear(self) -> None: + self._store.delete(self._ck) + + def _serialize_one(self, value: T) -> bytes: + encoded = self._codec.encode(value) + return struct.pack(">I", len(encoded)) + encoded + + def _serialize_batch(self, values: List[T]) -> bytes: + parts: List[bytes] = [] + for v in values: + encoded = self._codec.encode(v) + parts.append(struct.pack(">I", 
len(encoded)) + encoded) + return b"".join(parts) + + def _deserialize(self, raw: bytes) -> List[T]: + out: List[T] = [] + idx = 0 + while idx < len(raw): + if len(raw) - idx < 4: + raise KvError("corrupted list payload: truncated length") + (item_len,) = struct.unpack(">I", raw[idx : idx + 4]) + idx += 4 + if item_len < 0 or len(raw) - idx < item_len: + raise KvError("corrupted list payload: invalid element length") + item_raw = raw[idx : idx + item_len] + idx += item_len + out.append(self._codec.decode(item_raw)) + return out + + +__all__ = ["ListState"] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/structures/map_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/structures/map_state.py new file mode 100644 index 00000000..2d129b30 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/structures/map_state.py @@ -0,0 +1,128 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from dataclasses import dataclass +from typing import Any, Generic, Iterator, Optional, Tuple, Type, TypeVar + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, default_codec_for + +K = TypeVar("K") +V = TypeVar("V") + + +@dataclass +class MapEntry(Generic[K, V]): + key: K + value: V + + +class MapState(Generic[K, V]): + def __init__(self, store: KvStore, key_codec: Codec[K], value_codec: Codec[V]): + if store is None: + raise KvError("map state store must not be None") + if key_codec is None or value_codec is None: + raise KvError("map state codecs must not be None") + if not getattr(key_codec, "supports_ordered_keys", False): + raise KvError("map state key codec must support ordered key encoding for range scans") + self._store = store + self._key_codec = key_codec + self._value_codec = value_codec + self._key_group = b"" + self._key = b"" + self._namespace = b"" + + @classmethod + def with_auto_key_codec( + cls, + store: KvStore, + key_type: Type[K], + value_codec: Codec[V], + ) -> "MapState[K, V]": + key_codec = default_codec_for(key_type) + return cls(store, key_codec, value_codec) + + @classmethod + def from_context( + cls, + ctx: Any, + store_name: str, + key_codec: Codec[K], + value_codec: Codec[V], + ) -> "MapState[K, V]": + """Create a MapState from a context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, key_codec, value_codec) + + @classmethod + def from_context_auto_key_codec( + cls, + ctx: Any, + store_name: str, + value_codec: Codec[V], + ) -> "MapState[K, V]": + """Create a MapState with default (bytes) key codec from context and store name.""" + from fs_api_advanced.codec import BytesCodec + store = ctx.getOrCreateKVStore(store_name) + return cls(store, BytesCodec(), value_codec) + + def put(self, key: K, value: V) -> None: + encoded_key = self._key_codec.encode(key) + encoded_value = self._value_codec.encode(value) + self._store.put(self._ck(encoded_key), 
encoded_value) + + def get(self, key: K) -> Optional[V]: + encoded_key = self._key_codec.encode(key) + raw = self._store.get(self._ck(encoded_key)) + if raw is None: + return None + return self._value_codec.decode(raw) + + def delete(self, key: K) -> None: + encoded_key = self._key_codec.encode(key) + self._store.delete(self._ck(encoded_key)) + + def clear(self) -> None: + self._store.delete_prefix(self._ck(b"")) + + def all(self) -> Iterator[Tuple[K, V]]: + it = self._store.scan_complex(self._key_group, self._key, self._namespace) + while it.has_next(): + item = it.next() + if item is None: + break + key_bytes, value_bytes = item + yield self._key_codec.decode(key_bytes), self._value_codec.decode(value_bytes) + + def _ck(self, user_key: bytes) -> ComplexKey: + return ComplexKey( + key_group=self._key_group, + key=self._key, + namespace=self._namespace, + user_key=user_key, + ) + + +def create_map_state_auto_key_codec( + store: KvStore, + key_type: Type[K], + value_codec: Codec[V], +) -> MapState[K, V]: + return MapState.with_auto_key_codec(store, key_type, value_codec) + + +def infer_ordered_key_codec(key_type: Type[Any]) -> Codec[Any]: + """Return an ordered key codec for the given key type (uses default_codec_for).""" + return default_codec_for(key_type) + + +__all__ = ["MapEntry", "MapState", "create_map_state_auto_key_codec", "infer_ordered_key_codec"] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/structures/priority_queue_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/structures/priority_queue_state.py new file mode 100644 index 00000000..5b8bf4a8 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/structures/priority_queue_state.py @@ -0,0 +1,94 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any, Generic, Iterator, Optional, Tuple, TypeVar + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, IntCodec + +T = TypeVar("T") + + +class PriorityQueueState(Generic[T]): + """State for a priority queue. codec must support ordered key encoding (supports_ordered_keys=True).""" + + def __init__(self, store: KvStore, codec: Codec[T]): + if store is None: + raise KvError("priority queue store must not be None") + if codec is None: + raise KvError("priority queue codec must not be None") + if not getattr(codec, "supports_ordered_keys", False): + raise KvError("priority queue codec must support ordered key encoding") + self._store = store + self._codec = codec + self._key_group = b"" + self._key = b"" + self._namespace = b"" + + @classmethod + def from_context(cls, ctx: Any, store_name: str, codec: Codec[T]) -> "PriorityQueueState[T]": + """Create a PriorityQueueState from a context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, codec) + + @classmethod + def from_context_auto_codec(cls, ctx: Any, store_name: str) -> "PriorityQueueState[T]": + """Create a PriorityQueueState with default (int) codec from context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, IntCodec()) + + def _ck(self, user_key: bytes) -> ComplexKey: + return ComplexKey( + key_group=self._key_group, + key=self._key, + namespace=self._namespace, + user_key=user_key, + ) + + def add(self, value: T) -> None: + user_key = self._codec.encode(value) + 
self._store.put(self._ck(user_key), b"") + + def peek(self) -> Tuple[Optional[T], bool]: + it = self._store.scan_complex(self._key_group, self._key, self._namespace) + if not it.has_next(): + return (None, False) + item = it.next() + if item is None: + return (None, False) + user_key, _ = item + return (self._codec.decode(user_key), True) + + def poll(self) -> Tuple[Optional[T], bool]: + val, found = self.peek() + if not found or val is None: + return (val, found) + user_key = self._codec.encode(val) + self._store.delete(self._ck(user_key)) + return (val, True) + + def clear(self) -> None: + self._store.delete_prefix( + ComplexKey(key_group=self._key_group, key=self._key, namespace=self._namespace, user_key=b"") + ) + + def all(self) -> Iterator[T]: + it = self._store.scan_complex(self._key_group, self._key, self._namespace) + while it.has_next(): + item = it.next() + if item is None: + break + user_key, _ = item + yield self._codec.decode(user_key) + + +__all__ = ["PriorityQueueState"] diff --git a/python/functionstream-api-advanced/src/fs_api_advanced/structures/reducing_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/structures/reducing_state.py new file mode 100644 index 00000000..b1912517 --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/structures/reducing_state.py @@ -0,0 +1,81 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from typing import Any, Callable, Generic, Optional, Tuple, TypeVar + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, PickleCodec + +V = TypeVar("V") + +ReduceFunc = Callable[[V, V], V] + + +class ReducingState(Generic[V]): + def __init__(self, store: KvStore, value_codec: Codec[V], reduce_func: ReduceFunc[V]): + if store is None: + raise KvError("reducing state store must not be None") + if value_codec is None or reduce_func is None: + raise KvError("reducing state value codec and reduce function are required") + self._store = store + self._value_codec = value_codec + self._reduce_func = reduce_func + self._ck = ComplexKey( + key_group=b"", + key=b"", + namespace=b"", + user_key=b"", + ) + + @classmethod + def from_context( + cls, + ctx: Any, + store_name: str, + value_codec: Codec[V], + reduce_func: ReduceFunc[V], + ) -> "ReducingState[V]": + """Create a ReducingState from a context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, value_codec, reduce_func) + + @classmethod + def from_context_auto_codec( + cls, + ctx: Any, + store_name: str, + reduce_func: ReduceFunc[V], + ) -> "ReducingState[V]": + """Create a ReducingState with default (pickle) value codec from context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, PickleCodec(), reduce_func) + + def add(self, value: V) -> None: + raw = self._store.get(self._ck) + if raw is None: + result = value + else: + old_value = self._value_codec.decode(raw) + result = self._reduce_func(old_value, value) + self._store.put(self._ck, self._value_codec.encode(result)) + + def get(self) -> Tuple[Optional[V], bool]: + raw = self._store.get(self._ck) + if raw is None: + return (None, False) + return (self._value_codec.decode(raw), True) + + def clear(self) -> None: + self._store.delete(self._ck) + + +__all__ = ["ReduceFunc", "ReducingState"] diff --git 
a/python/functionstream-api-advanced/src/fs_api_advanced/structures/value_state.py b/python/functionstream-api-advanced/src/fs_api_advanced/structures/value_state.py new file mode 100644 index 00000000..56e4971e --- /dev/null +++ b/python/functionstream-api-advanced/src/fs_api_advanced/structures/value_state.py @@ -0,0 +1,66 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from typing import Any, Generic, Optional, Tuple, TypeVar + +from fs_api.store import ComplexKey, KvError, KvStore +from fs_api_advanced.codec import Codec, PickleCodec + +T = TypeVar("T") + + +class ValueState(Generic[T]): + def __init__(self, store: KvStore, codec: Codec[T]): + if store is None: + raise KvError("value state store must not be None") + if codec is None: + raise KvError("value state codec must not be None") + self._store = store + self._codec = codec + self._ck = ComplexKey( + key_group=b"", + key=b"", + namespace=b"", + user_key=b"", + ) + + @classmethod + def from_context( + cls, + ctx: Any, + store_name: str, + codec: Codec[T], + ) -> "ValueState[T]": + """Create a ValueState from a context and store name (same as ctx.getOrCreateValueState(store_name, codec)).""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, codec) + + @classmethod + def from_context_auto_codec(cls, ctx: Any, store_name: str) -> "ValueState[T]": + """Create a ValueState with default (pickle) codec from context and store name.""" + store = ctx.getOrCreateKVStore(store_name) + return cls(store, 
PickleCodec()) + + def update(self, value: T) -> None: + self._store.put(self._ck, self._codec.encode(value)) + + def value(self) -> Tuple[Optional[T], bool]: + raw = self._store.get(self._ck) + if raw is None: + return (None, False) + return (self._codec.decode(raw), True) + + def clear(self) -> None: + self._store.delete(self._ck) + + +__all__ = ["ValueState"] diff --git a/python/functionstream-api/pyproject.toml b/python/functionstream-api/pyproject.toml index b7c107c6..adc2651b 100644 --- a/python/functionstream-api/pyproject.toml +++ b/python/functionstream-api/pyproject.toml @@ -25,7 +25,9 @@ dependencies = [ "cloudpickle>=2.0.0", ] -[tool.setuptools] -package-dir = {"" = "src"} -packages = ["fs_api", "fs_api.store"] +[tool.setuptools.packages.find] +where = ["src"] + +[tool.setuptools.package-dir] +"" = "src" diff --git a/python/functionstream-api/src/fs_api/__init__.py b/python/functionstream-api/src/fs_api/__init__.py index 1ecdd42d..365b29fc 100644 --- a/python/functionstream-api/src/fs_api/__init__.py +++ b/python/functionstream-api/src/fs_api/__init__.py @@ -36,4 +36,3 @@ "KvIterator", "KvStore", ] - diff --git a/python/functionstream-api/src/fs_api/context.py b/python/functionstream-api/src/fs_api/context.py index 52cbd8a5..2244ecad 100644 --- a/python/functionstream-api/src/fs_api/context.py +++ b/python/functionstream-api/src/fs_api/context.py @@ -5,23 +5,19 @@ # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on "AS IS" BASIS, +# distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. 
-""" -fs_api.context - -Context: Context object -""" import abc from typing import Dict + from .store import KvStore class Context(abc.ABC): - """Context object""" + """Low-level processor context. For Codec, ValueState and other state types use functionstream-api-advanced.""" @abc.abstractmethod def emit(self, data: bytes, channel: int = 0): @@ -37,11 +33,7 @@ def getOrCreateKVStore(self, name: str) -> KvStore: @abc.abstractmethod def getConfig(self) -> Dict[str, str]: - """ - Get global configuration Map + pass - Returns: - Dict[str, str]: Configuration dictionary - """ -__all__ = ['Context'] +__all__ = ["Context"] diff --git a/python/functionstream-api/src/fs_api/store/__init__.py b/python/functionstream-api/src/fs_api/store/__init__.py index 1cfdc8f8..dc415224 100644 --- a/python/functionstream-api/src/fs_api/store/__init__.py +++ b/python/functionstream-api/src/fs_api/store/__init__.py @@ -5,11 +5,16 @@ # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on "AS IS" BASIS, +# distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. +""" +Low-level store API: KvStore, errors, ComplexKey, KvIterator only. + +For Codec, ValueState, ListState, MapState, Keyed* states use functionstream-api-advanced. 
+""" from .error import KvError, KvNotFoundError, KvIOError, KvOtherError from .complexkey import ComplexKey @@ -17,12 +22,11 @@ from .store import KvStore __all__ = [ - 'KvError', - 'KvNotFoundError', - 'KvIOError', - 'KvOtherError', - 'ComplexKey', - 'KvIterator', - 'KvStore', + "KvError", + "KvNotFoundError", + "KvIOError", + "KvOtherError", + "ComplexKey", + "KvIterator", + "KvStore", ] - diff --git a/python/functionstream-runtime/Makefile b/python/functionstream-runtime/Makefile index 748f3638..b8426fcb 100644 --- a/python/functionstream-runtime/Makefile +++ b/python/functionstream-runtime/Makefile @@ -88,8 +88,9 @@ venv: $(PYTHON) install-deps: venv $(call log_info, Installing build dependencies...) @$(PIP) install componentize-py - @# [FIX 3] Use PYTHON_ROOT to find the sibling 'functionstream-api' + @# [FIX 3] Use PYTHON_ROOT to find the sibling 'functionstream-api' and 'functionstream-api-advanced' @$(PIP) install -e $(PYTHON_ROOT)/functionstream-api + @$(PIP) install -e $(PYTHON_ROOT)/functionstream-api-advanced @$(call log_success, Dependencies installed.) 
# Build the wasm component diff --git a/python/functionstream-runtime/build.py b/python/functionstream-runtime/build.py index e77d8552..f816c12f 100755 --- a/python/functionstream-runtime/build.py +++ b/python/functionstream-runtime/build.py @@ -39,6 +39,7 @@ WASM_OUTPUT = TARGET_DIR / "functionstream-python-runtime.wasm" FS_API_DIR = SCRIPT_DIR.parent / "functionstream-api" +FS_API_ADVANCED_DIR = SCRIPT_DIR.parent / "functionstream-api-advanced" WORLD_NAME = "processor-runtime" MAIN_MODULE = "fs_runtime.runner" @@ -114,6 +115,10 @@ def prepare_dependencies(self): self.python_exec, "-m", "pip", "install", "--target", str(DEPENDENCIES_DIR), str(FS_API_DIR) ], check=True, capture_output=True) + subprocess.run([ + self.python_exec, "-m", "pip", "install", + "--target", str(DEPENDENCIES_DIR), "--no-deps", str(FS_API_ADVANCED_DIR) + ], check=True, capture_output=True) def build_wasm(self): TARGET_DIR.mkdir(parents=True, exist_ok=True) diff --git a/python/functionstream-runtime/pyproject.toml b/python/functionstream-runtime/pyproject.toml index 7fa91c5d..a0ba4805 100644 --- a/python/functionstream-runtime/pyproject.toml +++ b/python/functionstream-runtime/pyproject.toml @@ -19,6 +19,7 @@ classifiers = [ ] dependencies = [ "functionstream-api>=0.0.1", + "functionstream-api-advanced>=0.1.0", "cloudpickle>=2.0.0", ] diff --git a/python/functionstream-runtime/src/fs_runtime/runner.py b/python/functionstream-runtime/src/fs_runtime/runner.py index c1d7f9d7..5b9b5735 100644 --- a/python/functionstream-runtime/src/fs_runtime/runner.py +++ b/python/functionstream-runtime/src/fs_runtime/runner.py @@ -21,6 +21,7 @@ logger = logging.getLogger(__name__) from .store.fs_context import WitContext, convert_config_to_dict +import fs_api_advanced _DRIVER: Optional[FSProcessorDriver] = None diff --git a/python/functionstream-runtime/src/fs_runtime/store/fs_context.py b/python/functionstream-runtime/src/fs_runtime/store/fs_context.py index 0dd9d624..0f247e36 100644 --- 
a/python/functionstream-runtime/src/fs_runtime/store/fs_context.py +++ b/python/functionstream-runtime/src/fs_runtime/store/fs_context.py @@ -60,6 +60,5 @@ def getOrCreateKVStore(self, name: str) -> KvStore: def getConfig(self) -> Dict[str, str]: return self._CONFIG.copy() - __all__ = ['WitContext', 'convert_config_to_dict'] diff --git a/scripts/setup.sh b/scripts/setup.sh index fbb0fadb..492556bf 100755 --- a/scripts/setup.sh +++ b/scripts/setup.sh @@ -100,6 +100,7 @@ log_ok log_step "API_INSTALL" "$PIP" install -q -e "$PYTHON_ROOT/functionstream-api" +"$PIP" install -q -e "$PYTHON_ROOT/functionstream-api-advanced" log_ok log_step "CLIENT_CODEGEN" @@ -130,7 +131,7 @@ log_ok log_step "WASM_BUILD" TARGET_DIR="$ROOT_DIR/data/cache/python-runner" mkdir -p "$TARGET_DIR" -(cd "$PYTHON_ROOT/functionstream-runtime" && PYTHONPATH="$PYTHON_ROOT/functionstream-api" "$PYTHON_BIN" build.py > /dev/null) +(cd "$PYTHON_ROOT/functionstream-runtime" && PYTHONPATH="$PYTHON_ROOT/functionstream-api:$PYTHON_ROOT/functionstream-api-advanced" "$PYTHON_BIN" build.py > /dev/null) WASM_SRC="$PYTHON_ROOT/functionstream-runtime/target/functionstream-python-runtime.wasm" if [ -f "$WASM_SRC" ]; then