diff --git a/docs/DATA EVENTS/data-events-reference/data-events-inference.md b/docs/DATA EVENTS/data-events-reference/data-events-inference.md
index f11e0da..6ea86f0 100644
--- a/docs/DATA EVENTS/data-events-reference/data-events-inference.md
+++ b/docs/DATA EVENTS/data-events-reference/data-events-inference.md
@@ -13,68 +13,178 @@ next:
 
 ## Description
 
-The `INFERENCE` function performs inference on a given input, typically an image, using a specified machine learning model. It processes the input data and returns the results in a format suitable for further analysis or display within your application. This is useful for integrating AI-driven features, such as object detection, classification, or image recognition, into your application components.
-
-> ⚠️ **Upcoming Change: INFERENCE Data Event Library Update**
-> We'll be updating the underlying library powering the INFERENCE data event in a future release. If your workflows depend on this event, we'd like to hear from you before the change goes out. Please reach out to [product@fulcrumapp.com](mailto:product@fulcrumapp.com) with any questions or concerns.
+The `INFERENCE` function performs on-device machine learning or generative AI inference using a specified model. It supports computer vision tasks (such as image classification, object detection, or image recognition) and generative text tasks (such as summarization, assistant chats, or text classification) directly on the mobile device.
 
 **THIS FUNCTION WORKS ON MOBILE DEVICES, BUT NOT IN THE WEB RECORD EDITOR**
 
+> ⚠️ **Device Resource & Battery Usage Warning**
+> On-device model inference is highly resource-intensive and will consume substantial battery and memory. Requirements scale directly with the size of the loaded model.
+>
+> **Generative LLMs** are especially demanding; consider limiting them to modern flagship devices and/or documenting minimum device requirements (RAM/SoC) for your users.
+
+## Execution Modes
+
+The execution mode determines how the system runs the model. `INFERENCE` supports three modes:
+
+1. **Vision ML**: Used for on-device computer vision tasks (such as image classification, object detection, or image recognition).
+2. **Generative LLM**: Used for on-device generative text tasks (such as summarization, assistant chats, or text classification).
+3. **Legacy Vision ML (ONNX - Deprecated)**: Fallback execution when `options.config` is omitted. **Support for ONNX is deprecated. Please upgrade to modern configurations.**
+
+> ⚠️ **Model Type Auto-Detection**
+>
+> The model type is determined **strictly by the file extension** of the model file passed to `options.model`.
+>
+> Auto-detection is **not** influenced or overridden by the parameters passed inside `options.config`. However, **the parameters in `options.config` must match the auto-detected model type** (e.g., providing a `size` parameter for a Vision ML model, or a `prompt` parameter for a Generative LLM).
+
+
+---
+
+## Model Resolution & Supported File Extensions
+
+The `options.model` parameter accepts a string: the filename of a model uploaded to the form's reference files.
+
+### Supported File Extensions & Model Types
+
+The system selects the machine learning engine based on the model's file extension:
+
+| File Extension | Detected Model Type | Typical Use Cases |
+| :--- | :--- | :--- |
+| **`.tflite`** | **Vision ML** | Image classification, object detection, image recognition |
+| **`.gguf`**, **`.litertlm`**, **`.task`** | **Generative LLM** | Text generation, text summarization, assistant chats, text classification |
+
+### Model Loading
+
+If you bundle custom models as form reference files (e.g., `mobilenet.tflite` or `gemma.gguf`), pass the exact filename (including extension) as the `options.model` string.
+
+---
+
 ## Parameters
 
+### Common Parameters
 * `options` object (required) - An object containing the parameters for the function.
+  * `model` string (required) - The exact model filename uploaded to the form's reference files to be loaded.
+  * `form_id` string (optional) - The identifier of the form (defaults to current form).
+  * `form_name` string (optional) - The name of the form.
+
+---
+
+### Mode 1: Vision ML (for `.tflite` models)
+*Used for running image classification, object detection, and other computer vision models.*
+
+* `options` object:
   * `photo_id` string (required) - The identifier of the photo to be processed.
-  * `model` object (required) - The machine learning model to be used for inference.
-  * `size` number (optional) - The size to which the input should be resized before inference. Default is 640.
-  * `format` string (optional) - The format of the input data. Can be either 'chw' (channels, height, width) or 'hwc' (height, width, channels). Choose the format based on how the model was exported. Default is 'chw'.
-  * `type` string (optional) - The data type of the input. Default is 'float'.
-  * `mean` array (optional) - The mean values for normalizing the input data. Default is `[0.485, 0.456, 0.406]`.
-  * `std` array (optional) - The standard deviation values for normalizing the input data. Default is `[0.229, 0.224, 0.225]`.
+  * `config` object (required) - Configuration for the computer vision engine:
+    * `size` number (required) - The input image is resized to a square before it is passed to the model; `size` is the length of one side. It must be greater than 0 and should match the input size the model expects.
+    * `format` string (optional) - The format of the input image data. Either `'chw'` (channels, height, width) or `'hwc'` (height, width, channels).
+    * `inputType` string (optional) - The data type of the model input. Either `'int8'` or `'float'`.
+    * `mean` array (optional) - An array of exactly 3 numbers for normalizing the input data (e.g. `[0.485, 0.456, 0.406]`).
+    * `std` array (optional) - An array of exactly 3 standard deviation values for normalizing the input data (e.g. `[0.229, 0.224, 0.225]`).
 
-* `callback` function (required) - A function to be executed after the inference is completed. It receives two parameters:
-  * `error` object - Contains information if an error occurs during inference.
-  * `result` object - Contains the outputs of the inference.
+---
+
+### Mode 2: Generative LLM (for `.gguf`, `.litertlm`, and `.task` models)
+*Used for running on-device generative AI large language models.*
+
+* `options` object:
+  * `photo_id` string (optional) - Omit for text-only LLM tasks. Provide the identifier of the photo to include for multimodal LLMs.
+  * `config` object (required) - Configuration for the generative text engine:
+    * `prompt` string (optional*) - The input instruction prompt.
+    * `systemPrompt` string (optional*) - System instructions to guide the model's behavior, tone, or role.
+    * `temperature` number (optional) - Controls randomness in generation. Must be non-negative.
+    * `topK` number (optional) - Restricts sampling to the top K most likely tokens. Must be a positive integer.
+    * `topP` number (optional) - Restricts sampling to cumulative probability P. Must be non-negative.
+    * `maxTokens` number (optional) - Maximum number of tokens to generate. Must be a positive integer.
+    * `contextSize` number (optional) - Context window size. Must be a positive integer.
+    * `stopTokens` array (optional) - Array of non-empty strings representing tokens that halt generation.
+
+    * **Note:** At least one of `prompt` or `systemPrompt` must be provided.
+
+---
+
+### Mode 3: Legacy Vision ML (ONNX - Deprecated)
+*Deprecated. Use the config-based Vision ML schema (Mode 1) instead.*
+
+* `options` object:
+  * `photo_id` string (required)
+  * `size` number (required)
+  * `format` string (optional) - Either `'hwc'` or `'chw'`.
+  * `type` string (optional) - Either `'uint8'` or `'float'`.
+  * `mean` array (optional)
+  * `std` array (optional)
+
+---
+
+### Callback Signature
+* `callback` function (required) - Executed after the inference is completed. Receives two arguments:
+  * `error` object - Contains error information if inference fails, otherwise `null`.
+  * `result` object - Contains the outputs:
+    * **For Vision ML / Legacy ML**: A `result.outputs` object where output arrays are automatically flattened.
+    * **For Generative LLM**: A `result.outputs` object containing `result.outputs.text` (the generated text response) and a `result.modelType` of `'LLM'`.
+
+---
 
 ## Examples
 
+### Example 1: Vision ML
 ```javascript
-// Example of performing inference on a photo using a pre-trained model and handling the results
+// Perform on-device image classification when a photo is added
 ON('add-photo', 'photos', (event) => {
   INFERENCE({
+    model: 'fulcrum-pylon.tflite', // Model reference file uploaded to the form
     photo_id: event.value.id,
-    model: preTrainedModel,
-    size: 640,
-    format: 'chw',
-    type: 'float',
-    mean: [0.485, 0.456, 0.406],
-    std: [0.229, 0.224, 0.225]
-  }, (error, { outputs }) => {
+    config: {
+      size: 224,
+      format: 'chw',
+      inputType: 'float',
+      mean: [0.485, 0.456, 0.406],
+      std: [0.229, 0.224, 0.225]
+    }
+  }, (error, result) => {
     if (error) {
-      ALERT(error.message);
+      ALERT('Inference failed: ' + error.message);
       return;
     }
 
-    const results = Object.values(outputs)[0].value.map((score, index) => {
-      return {
-        index,
-        score,
-        label: LABELS[index]
-      };
-    });
+    const outputs = result.outputs;
+    const scores = Object.values(outputs)[0].value;
+
+    // Process output scores...
+    SETVALUE('class_result', 'Successfully analyzed image!');
+  });
+});
+```
 
-    const sorted = results.sort((a, b) => b.score - a.score);
+### Example 2: Generative LLM
+```javascript
+// Use an on-device LLM to summarize notes when a record is saved
+ON('save-record', () => {
+  const notes = VALUE('notes');
+  if (!notes) return;
 
-    const topK = top != null ? sorted.slice(0, top) : sorted;
+  INFERENCE({
+    model: 'gemma-4-e2b.litertlm',
+    config: {
+      systemPrompt: 'You are an assistant. Summarize the user text in one short sentence.',
+      prompt: notes,
+      temperature: 0.7,
+      maxTokens: 100
+    }
+  }, (error, result) => {
+    if (error) {
+      ALERT('Summarization failed: ' + error.message);
+      return;
+    }
 
-    SETVALUE('my_detections', JSON.stringify(topK));
+    // Access the generated response text
+    SETVALUE('summary', result.outputs.text);
   });
 });
 ```
 
 ## Usage
 
-The `INFERENCE` function is typically used when you need to perform AI-driven tasks such as image classification, object detection, or any other form of model inference. By providing a photo ID and the relevant model, you can process images directly within your application and obtain results for further action, like displaying detected objects or classifying images.
-
-This function is particularly useful in applications that require dynamic analysis or AI-based decision-making, enabling seamless integration of advanced machine learning models into your workflows.
+The `INFERENCE` function is typically used in applications requiring offline, local, or low-latency intelligence on-device:
+* **Image Recognition / Classification**: Verify image contents, detect equipment, or perform safety audits offline without any internet connection.
+* **On-Device LLMs**: Perform smart form calculations, generate field summaries, suggest translations, or parse unstructured user text instantly in the field.
 
 **Note:** This feature is only available with Elite and Enterprise plans. Check out [our plans page](https://www.fulcrumapp.com/pricing/) for more information.
\ No newline at end of file
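
The new docs say the model type comes strictly from the file extension and that `options.config` must match the detected type. A rough way to picture that rule is a pre-flight check run before calling `INFERENCE`. The sketch below is only an illustration of the documented constraints, not the platform's implementation: `detectModelType`, `checkInferenceOptions`, and the `'VISION'` label are hypothetical names (only `'LLM'` appears in the docs, as `result.modelType`). It assumes extensions are matched exactly as written in the table, and it deliberately ignores the deprecated legacy mode, so a call without `config` is reported as a problem.

```javascript
// Hypothetical pre-flight helpers; names and labels are illustrative
// assumptions, not part of the INFERENCE API.

// Extension-to-type mapping copied from the table in this change.
const MODEL_TYPES = {
  '.tflite': 'VISION', // 'VISION' is an assumed label for Vision ML
  '.gguf': 'LLM',
  '.litertlm': 'LLM',
  '.task': 'LLM'
};

const isPositiveInt = (n) => Number.isInteger(n) && n > 0;
const isNonNegative = (n) => typeof n === 'number' && n >= 0;

// Returns 'VISION', 'LLM', or null when the extension is not in the table.
function detectModelType(filename) {
  const name = String(filename);
  const dot = name.lastIndexOf('.');
  if (dot < 0) return null;
  return MODEL_TYPES[name.slice(dot)] || null;
}

// Returns a list of problems; an empty list means the documented rules hold.
function checkInferenceOptions(options) {
  const problems = [];
  const type = detectModelType(options.model);
  const config = options.config;

  if (type === null) {
    return ['unsupported model extension: ' + options.model];
  }
  if (config === null || typeof config !== 'object') {
    return ['config is required for ' + type + ' models'];
  }

  if (type === 'VISION') {
    if (!options.photo_id) problems.push('photo_id is required');
    if (!(typeof config.size === 'number' && config.size > 0)) {
      problems.push('config.size must be a number greater than 0');
    }
    for (const key of ['mean', 'std']) {
      const v = config[key];
      const ok = Array.isArray(v) && v.length === 3 &&
        v.every((x) => typeof x === 'number');
      if (v !== undefined && !ok) {
        problems.push('config.' + key + ' must be an array of exactly 3 numbers');
      }
    }
  } else {
    if (!config.prompt && !config.systemPrompt) {
      problems.push('provide at least one of config.prompt or config.systemPrompt');
    }
    for (const key of ['temperature', 'topP']) {
      if (config[key] !== undefined && !isNonNegative(config[key])) {
        problems.push('config.' + key + ' must be non-negative');
      }
    }
    for (const key of ['topK', 'maxTokens', 'contextSize']) {
      if (config[key] !== undefined && !isPositiveInt(config[key])) {
        problems.push('config.' + key + ' must be a positive integer');
      }
    }
    const stops = config.stopTokens;
    const stopsOk = Array.isArray(stops) &&
      stops.every((s) => typeof s === 'string' && s.length > 0);
    if (stops !== undefined && !stopsOk) {
      problems.push('config.stopTokens must be an array of non-empty strings');
    }
  }
  return problems;
}
```

In a data event, one might call `checkInferenceOptions(options)` first and pass any non-empty result to `ALERT` instead of invoking `INFERENCE`. That surfaces a mismatched `config` before the model is loaded on the device.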