[DISCUSSION] New AI/ML functionalities #3215

monatis · 2019-11-15T21:04:26Z

monatis
Nov 15, 2019

Hi,
I want to contribute to Appwrite with a separate container for AI/ML functionalities. It will be served as a web API just like other services, and it might be tightly integrated with them, e.g. storage API to process images.

Possible functionalities might include:

Nudity detection in images,
General purpose object detection,
Face detection,
Sentiment classification in texts,
and other suggestions are welcome.

In image understanding APIs, I'm thinking of accepting images as base64-encoded strings in the JSON payloads of POST requests. So, a sample request might be as in the following:

{
  "image":
  {
    "type": "base64",
    "data": "base64 encoded string goes here"
  },
  "feature": "nudity-detection"
}

One advantage of this approach over multipart/form-data is that we can also accept image URLs instead of base64-encoded image data with the same schema, as in the following:

{
  "image":
  {
    "type": "url",
    "data": "https://example.com/image.jpg"
  },
  "feature": "nudity-detection"
}

For Appwrite, we can also accept a file id uploaded through storage API.
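A minimal sketch of how a service might validate and unpack the "image" object from the two sample requests above. The function name `parse_image_field` and the error messages are hypothetical, and downloading the URL is deliberately left out.

```python
from __future__ import annotations

import base64
import binascii
from urllib.parse import urlparse


def parse_image_field(payload: dict) -> tuple[str, bytes | str]:
    """Validate payload["image"] and return (kind, value).

    kind is "bytes" for decoded base64 data, or "url" for a link that the
    service would still have to download.
    """
    image = payload.get("image")
    if not isinstance(image, dict):
        raise ValueError('"image" must be an object')

    kind, data = image.get("type"), image.get("data")
    if not isinstance(data, str) or not data:
        raise ValueError('"image.data" must be a non-empty string')

    if kind == "base64":
        try:
            # validate=True rejects characters outside the base64 alphabet
            return "bytes", base64.b64decode(data, validate=True)
        except binascii.Error as exc:
            raise ValueError("image.data is not valid base64") from exc

    if kind == "url":
        parsed = urlparse(data)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("image.data must be an http(s) URL")
        return "url", data

    raise ValueError(f"unsupported image type: {kind!r}")
```

Because both variants share one schema, supporting a further "type" value would only need one more branch.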

Please leave your comments to help us cover more needs and use cases with a short learning curve. What other use cases could we provide an AI/ML API for? How should we implement them to minimize overhead on the client side yet remain flexible to different scenarios?

eldadfux · 2019-11-15T22:33:21Z

eldadfux
Nov 15, 2019
Maintainer

Thank you for the great ideas and the interesting discussion @monatis!

Any ML services that we use internally and are not exposed via Appwrite REST API can even share direct access to the storage for better performance the same way we work with the ClamAV container to scan uploaded files. Not too sure about using JSON instead of multipart/form-data in file uploads because this can be very limiting when trying to upload big files (75MB+).
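To put a number on the size concern: base64 encodes every 3 bytes as 4 characters, so a JSON payload is about a third larger than the raw file. A small sketch of the arithmetic (the helper name is hypothetical):

```python
import base64
import math


def base64_encoded_size(raw_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of raw_bytes bytes."""
    return 4 * math.ceil(raw_bytes / 3)


# A 75 MiB file grows to 100 MiB once base64-encoded.
size_mib = base64_encoded_size(75 * 1024 * 1024) / (1024 * 1024)
print(size_mib)  # 100.0

# Cross-check the formula against the standard library on a small input.
assert base64_encoded_size(10) == len(base64.b64encode(b"x" * 10))
```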

A few points that are really important for me to take into account:

Choosing 3rd-party dependencies very carefully. Some of the things we take into consideration when introducing new dependencies are listed in this Medium post.
Trying our best to hide any ML/AI complexity from Appwrite end developers. Adding ML should be a means to introduce new capabilities in the API, not our final goal.
We should make sure our minimum requirements stay in the region of 1-2 CPU cores and 1-2 GB memory. Appwrite should be portable and cheap to set up.

Another thing to think about is how we are going to handle ML training. Should training run in the background? Should it run offline and get updated only on version release? How can we scale this and still be able to run on cheap hardware?

I would love to hear more ideas about adding ML power to our existing API functionality.

0 replies

monatis · 2019-11-16T04:48:06Z

monatis
Nov 16, 2019
Author

Hi @eldadfux, thank you for bringing your considerations about dependencies to our attention. I mostly agree with them.
I think that we shouldn't expose a training API in the initial plan. This would be too complex and would require a lot of memory and possibly even a GPU, which would change the target audience of Appwrite.
Instead, I suggest that we deploy several pretrained AI/ML models for image and text understanding that could be applied to the most common scenarios. For instance, we might have an endpoint /v1/image-understanding that returns a score for the likelihood that the image contains explicit content. Or, it might return a list of objects found in the image and their bounding boxes when the value of the key "feature" is set to "object-detection". Similarly, it can return a list of any faces in the image and their bounding boxes when "feature" is set to "face-detection".

The AI/ML models for any endpoint or "feature" will be pretrained and optimized for inference on cheap hardware. Keeping training tasks out of the scope of this feature will also eliminate the need to use multipart for the sake of supporting large files. Any file sent through the image understanding API should not exceed 75 MB in size. However, if you think that we need to support multipart/form-data anyway, it's no problem. I might provide a sample API based on the considerations above within a few days.
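The single-endpoint idea, where the "feature" key selects the model, could be sketched as a lookup table. The handler functions below are placeholders that return empty results, not real models, and the response shape is an assumption.

```python
from __future__ import annotations

from typing import Callable


# Placeholder "models": each takes raw image bytes and returns a result dict.
def detect_nudity(image: bytes) -> dict:
    return {"score": 0.0}  # placeholder score, no real model behind it


def detect_objects(image: bytes) -> dict:
    return {"objects": []}  # would hold labels and bounding boxes


def detect_faces(image: bytes) -> dict:
    return {"faces": []}  # would hold bounding boxes


FEATURES: dict[str, Callable[[bytes], dict]] = {
    "nudity-detection": detect_nudity,
    "object-detection": detect_objects,
    "face-detection": detect_faces,
}


def image_understanding(feature: str, image: bytes) -> dict:
    """Dispatch one request to the handler named by its "feature" value."""
    handler = FEATURES.get(feature)
    if handler is None:
        raise ValueError(f"unknown feature: {feature!r}")
    return {"feature": feature, "result": handler(image)}
```

Adding a new capability then means registering one more entry in `FEATURES`, without touching the request schema.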

0 replies

eldadfux · 2019-11-16T05:52:22Z

eldadfux
Nov 16, 2019
Maintainer

@monatis A demo API will be great.

I am also considering the option to hide this functionality in background tasks that are triggered by our queue engine (redis/resque); each task can then apply its results as new metadata to the existing file information. This way we will get these abilities by default for all relevant files, and the API will have zero performance penalty.
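A rough sketch of the background-task idea, using Python's in-process `queue.Queue` as a stand-in for the redis/resque engine and a plain dict as a stand-in for file records. `analyze` is a placeholder for the call to the ML container; all names are hypothetical.

```python
from __future__ import annotations

import queue
import threading

file_metadata: dict[str, dict] = {}   # stand-in for stored file records
jobs: queue.Queue = queue.Queue()     # stand-in for the real queue engine


def analyze(file_id: str) -> dict:
    """Placeholder for the ML container call; returns extra metadata."""
    return {"labels": [], "analyzed": True}


def worker() -> None:
    """Consume file ids and merge analysis results into their metadata."""
    while True:
        file_id = jobs.get()
        try:
            if file_id is None:  # sentinel: stop the worker
                return
            file_metadata[file_id].update(analyze(file_id))
        finally:
            jobs.task_done()


def start_worker() -> threading.Thread:
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def on_upload(file_id: str, name: str) -> None:
    """The upload path only records the file and enqueues a job."""
    file_metadata[file_id] = {"name": name}
    jobs.put(file_id)
```

The upload call returns immediately; the extra metadata shows up on the file record once the worker has processed the job.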

I am also thinking about the possibility of allowing these kinds of functions to be added using a plugin mechanism, to allow more custom and feature-specific abilities.

0 replies

monatis · 2019-11-16T06:19:46Z

monatis
Nov 16, 2019
Author

@eldadfux setting the functionality as a background task would be great in terms of reducing performance penalty, but it might be limiting for use cases where we want to use the API for the core app logic. We might need to come up with an idea to support both.

Another scenario might be conditional upload. For example, post an image to the API and specify conditions, e.g., that it contains at least one face with a confidence score of >0.5. The API could then look for faces, talk to the storage API internally to save the image if it meets the condition, and return metadata from the storage API; if the image does not meet the condition, it can simply return an error message.
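The conditional-upload flow could look roughly like this. The face detector is passed in as a callable, the dict stands in for the storage API, and the returned metadata fields are assumptions for illustration only.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass
class Face:
    confidence: float
    box: tuple[int, int, int, int]  # x, y, width, height


class ConditionNotMet(Exception):
    """Raised when the image fails the caller's upload condition."""


def conditional_upload(
    image: bytes,
    detect_faces: Callable[[bytes], list[Face]],
    storage: dict[str, bytes],
    min_faces: int = 1,
    min_confidence: float = 0.5,
) -> dict:
    """Store the image only if enough faces exceed the confidence threshold."""
    confident = [f for f in detect_faces(image) if f.confidence > min_confidence]
    if len(confident) < min_faces:
        raise ConditionNotMet(
            f"found {len(confident)} face(s) above {min_confidence}, need {min_faces}"
        )
    file_id = hashlib.sha256(image).hexdigest()[:16]  # hypothetical id scheme
    storage[file_id] = image
    return {"fileId": file_id, "size": len(image), "faces": len(confident)}
```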

0 replies

eldadfux · 2019-11-16T06:25:47Z

eldadfux
Nov 16, 2019
Maintainer

@monatis I totally get what you're saying. We need to figure out what the perfect balance is. One of Appwrite's major selling points is hiding complexity from developers, but at the end of the day we don't want to limit more advanced scenarios; therefore, finding the right balance is crucial.

0 replies

eldadfux · 2019-11-16T06:26:44Z

eldadfux
Nov 16, 2019
Maintainer

Anyway, I think a demo API on an independent container could be great and will also help us figure out more things we need to decide about the scope of these features.

0 replies

monatis · 2019-11-16T06:38:53Z

monatis
Nov 16, 2019
Author

Agreed.
Great, I'll be sharing a demo API at the beginning of next week then.

0 replies

eldadfux · 2019-11-23T15:43:50Z

eldadfux
Nov 23, 2019
Maintainer

@monatis let me know if you need any help. I am available on our Discord server:
https://discord.gg/GSeTUeA

0 replies

monatis · 2019-11-23T18:45:16Z

monatis
Nov 23, 2019
Author

@eldadfux I've already made some progress in this, coding a Docker-containerized object detection API. I'm travelling this weekend to talk at some developer events, but I'll share my code as soon as possible for further discussion and to get some clues around integration with the storage API.

0 replies

eldadfux · 2019-11-23T18:47:40Z

eldadfux
Nov 23, 2019
Maintainer

@monatis sounds cool! Have a great trip and keep us posted! 👍🏻

0 replies

monatis · 2019-11-26T10:32:41Z

monatis
Nov 26, 2019
Author

I pushed the first draft of image understanding APIs for Appwrite to a private repository on GitHub and invited @eldadfux as a collaborator for further discussion. I might invite others as collaborators to this repo as well, but I find it more appropriate to keep it private at this very early stage where we need to figure out many things before going public.
I designed basic schemas for requests to and responses from AI/ML APIs. Currently, it can

accept an image URL or base64 encoding of an image in a given JSON schema,
parse the image appropriately based on the format it is passed in,
detect objects in it,
return a response in a certain JSON schema including basic metadata and a list of detected objects and their bounding boxes.

Now I'm planning to

make metadata in response more verbose,
accept imageId in requests and fetch that image from the storage service,
publish detected objects to the storage service,
make minimum confidence score configurable in requests,
test extreme cases.

We can choose one of the possible ways for integration with the storage service.

a) The storage service should accept requests from clients and forward image understanding endpoints to the appropriate container (that I'm developing) after authentication/authorization.
b) Mount endpoints exposed by the image understanding container to some path, say, /image-understanding, and the container should verify requests, validate authorization tokens, talk to the storage service internally, etc.

Which one would you prefer, or any other idea on this?

0 replies

eldadfux · 2019-11-26T18:49:00Z

eldadfux
Nov 26, 2019
Maintainer

Hey @monatis!

Thank you for keeping us up-to-date with your progress. I did see you invited me as a collaborator to the image-understanding repo, and I will make some time during this week or the upcoming weekend to give it a proper review.

I think that at this stage, it will be best if we develop all AI/ML capabilities as standalone containers that are 100% independent and can be considered as their own project.

Appwrite should be the only gateway to developers and clients, so I think it will be best that Appwrite and the ML/AI containers communicate internally over HTTP, as do all other containers Appwrite communicates with.

I was also thinking a bit more about the way we should pass the files between the two containers, and I don’t think we could use a fileId or a file path. This is because Appwrite encrypts all files before saving them on the storage device, and I think it’s best for both security and separation of concerns that the ML/AI container doesn’t deal with decrypting files or understanding Appwrite complexities. I guess this only leaves us with passing the entire file using a multipart POST request.
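For illustration, this is roughly what building such a multipart POST between containers involves, using only the Python standard library. The URL, field names and helper names are hypothetical. The sketch builds the request without sending it and buffers the whole file in memory, which a real implementation would likely avoid for large files.

```python
from __future__ import annotations

import urllib.request
import uuid


def encode_multipart(
    fields: dict[str, str],
    file_field: str,
    filename: str,
    content: bytes,
    content_type: str = "application/octet-stream",
) -> tuple[bytes, str]:
    """Return (body, Content-Type header value) for a multipart/form-data POST."""
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\nContent-Type: {content_type}\r\n\r\n'.encode()
    )
    parts.append(content)  # raw, already-decrypted file bytes
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(url: str, feature: str, filename: str, content: bytes) -> urllib.request.Request:
    """Prepare (but do not send) the request to the ML container."""
    body, header = encode_multipart({"feature": feature}, "image", filename, content)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
```

Sending it would be a call to `urllib.request.urlopen(request)` from inside the internal network.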

0 replies

MeWasif · 2022-11-30T10:40:15Z

MeWasif
Nov 30, 2022

Hey guys, can anyone please help me out? I want to deploy a deep-learning image segmentation model (DeepLabv3+) to a server so that I can use it in a Flutter-based mobile application. I have created the model, and it is in .h5 format, but the problem is that I don't know where to host the model or how to use its API from the Flutter mobile app.

0 replies

kinthaiofficial · 2026-04-29T00:31:39Z

kinthaiofficial
Apr 29, 2026

AI/ML in a BaaS context is interesting because the platform is already solving the hard parts of multi-tenant state management, auth, and storage — the question is how to extend that to agent-native use cases.

A few things that would be high-value additions from an agent perspective:

Agent identity as a first-class entity — not just "user" and "service account" but "agent" with its own auth principal, permission scopes, and audit trail. An agent that acts on behalf of a user should be distinguishable from the user acting directly. This matters for compliance, for billing attribution, and for detecting when an agent goes out of scope.

Persistent memory store — agents need memory across sessions that's more structured than a generic document store. Ideally: entity-keyed storage with time-decay support, semantic search, and compaction policies. Vector-native storage is becoming table stakes here.

Budget/quota primitives — an agent that can make unlimited API calls or DB reads is a liability. BaaS platforms are well-positioned to expose per-agent rate limits and cost ceilings as first-class configuration, not something each developer has to implement themselves.

Real-time event streams for agents — not just for UI reactivity, but so agents can subscribe to "did my long-running sub-task complete?" events without polling. This maps well to what Supabase Realtime and Appwrite's event system already do.
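As one possible shape for the per-agent rate limit mentioned in the list above, here is a small token-bucket sketch. It is a generic technique, not an existing Appwrite feature; the class name and parameters are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    capacity: float            # maximum burst size
    refill_per_second: float   # sustained rate
    tokens: float = field(init=False)
    updated_at: float = 0.0

    def __post_init__(self) -> None:
        self.tokens = self.capacity  # start with a full bucket

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then try to spend `cost` tokens."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per agent id; each agent gets its own budget.
buckets: dict[str, TokenBucket] = {}


def allow_call(agent_id: str, now: float) -> bool:
    bucket = buckets.setdefault(agent_id, TokenBucket(capacity=2, refill_per_second=1))
    return bucket.allow(now)
```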

We've been building this layer in KinthAI's agent network on top of OpenClaw: https://blog.kinthai.ai/221-agents-multi-agent-coordination-lessons — the identity/delegation piece is at https://blog.kinthai.ai/agent-wallet-economic-models-autonomous-agents

Which of these would be most useful for the use cases you're hearing from Appwrite users?

0 replies

kinthaiofficial · 2026-04-29T01:21:05Z

kinthaiofficial
Apr 29, 2026

AI/ML functionalities in a BaaS like Appwrite make a lot of sense. The patterns that would have the most practical impact:

1. Vector storage and similarity search
Store embeddings alongside regular data, query by semantic similarity. This is the foundational primitive for RAG (Retrieval-Augmented Generation) applications. A vector field type + similarity_search() query operator would cover 80% of AI use cases.

2. Streaming response endpoint
Current Appwrite functions return sync responses. AI use cases (especially LLM calls) need streaming — returning tokens as they're generated rather than buffering the full response. Server-Sent Events (SSE) from functions would unlock this.

3. Background AI job queue
LLM calls and embedding generation are slow (seconds) and expensive. A dedicated job queue type (separate from regular cloud functions) with: retry logic, priority queuing, rate limiting, and result callbacks would let developers offload AI work without managing queues themselves.

4. Multi-modal file processing
Hook AI processing into the file lifecycle: when an image is uploaded, automatically generate a description and store it. When a document is uploaded, chunk and embed it. The event system already exists — AI processing hooks would extend it.

5. Agent-compatible auth
For applications where AI agents are first-class actors (not just a backend service calling AI APIs), the auth system should support agent sessions with scoped permissions — similar to API keys but with more granular capability control.

The biggest risk to avoid: building AI features that require Appwrite's hosted models. The more valuable approach is building the infrastructure (vectors, streaming, job queues) that lets developers bring their own models.
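To make the vector-search item (point 1) concrete, here is a brute-force sketch of similarity search over stored embeddings using cosine similarity. The function name `similarity_search` mirrors the operator suggested above but is purely illustrative; a real system would use a vector index rather than scanning every document.

```python
from __future__ import annotations

import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # zero vectors match nothing


def similarity_search(
    query: list[float],
    documents: dict[str, list[float]],
    top_k: int = 3,
) -> list[tuple[str, float]]:
    """Return the top_k (document id, score) pairs, best match first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in documents.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print([doc_id for doc_id, _ in similarity_search([1.0, 0.0], docs, top_k=2)])  # ['a', 'c']
```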

0 replies

musaabhasan · 2026-05-09T08:12:17Z

musaabhasan
May 9, 2026

For Appwrite, AI/ML would be strongest if it is exposed as platform primitives rather than a single generic inference endpoint.

I would separate three execution paths:

synchronous inference for small stateless checks, such as classification or moderation
asynchronous enrichment jobs attached to Storage or Database records, such as OCR, image labels, transcript generation, or malware-adjacent metadata extraction
retrieval primitives, such as embedding generation, vector indexes, similarity search, and citation-friendly document chunks

The security model matters as much as the model API. Derived artifacts should carry provenance: model name/version, input object hash, tenant/project ID, service identity, creation time, retention policy, and consent/classification flags. ACLs should flow from the source object to derived embeddings and labels; otherwise vector search can become an access-control side channel.
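A sketch of the two ideas in this paragraph: a provenance record attached to each derived artifact, and a filter that applies the source object's ACL to derived results. Field names and the ACL representation (a set of user ids per source object) are assumptions, and only a subset of the listed provenance fields is shown.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    model_name: str
    model_version: str
    input_sha256: str   # hash of the source object the artifact was derived from
    project_id: str
    created_at: str     # ISO 8601 timestamp, UTC


def make_provenance(source: bytes, model_name: str, model_version: str, project_id: str) -> Provenance:
    return Provenance(
        model_name=model_name,
        model_version=model_version,
        input_sha256=hashlib.sha256(source).hexdigest(),
        project_id=project_id,
        created_at=datetime.now(timezone.utc).isoformat(),
    )


def visible_artifacts(
    artifacts: list[dict],
    user_id: str,
    source_acl: dict[str, set[str]],
) -> list[dict]:
    """Keep only artifacts whose source object the user is allowed to read."""
    return [a for a in artifacts if user_id in source_acl.get(a["source_id"], set())]
```

Filtering search hits through `visible_artifacts` before returning them is what keeps embeddings and labels from leaking information about objects the caller cannot read.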

For payloads, base64 is convenient for early demos, but production APIs should prefer Storage object references for large files. That gives better auditability, scanning, retry behavior, deduplication, and lifecycle management while keeping inference services behind the same project/auth boundaries as the rest of Appwrite.

0 replies


Replies: 16 comments
