Replies: 16 comments
-
|
Thank you for the great ideas and the interesting discussion @monatis! Any ML services that we use internally and are not exposed via Appwrite REST API can even share direct access to the storage for better performance the same way we work with the ClamAV container to scan uploaded files. Not too sure about using JSON instead of multipart/form-data in file uploads because this can be very limiting when trying to upload big files (75MB+). A few points that are really important for me to take into account:
Another thing to think about is how we are going to handle ML training. Should training run in the background? Should it run offline and be updated only on a version release? How can we scale this and still be able to run on cheap hardware? I would love to hear more ideas about adding ML power to our existing API functionality. |
-
|
Hi @eldadfux, thank you for bringing your considerations about dependencies to my attention. I mostly agree with them. The AI/ML models for any endpoint or |
-
|
@monatis A demo API would be great. I am also considering the option of hiding this functionality in background tasks that are triggered by our queue engine (redis/resque); each task can then apply its results as new metadata on the existing file information. This way we get these abilities by default for all relevant files, and the API has zero performance penalty. I am also thinking about allowing these kinds of functions to be added through a plugin mechanism, to allow more custom and feature-specific abilities. |
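To make the background-task idea concrete, here is a minimal sketch. It uses Python's in-process `queue.Queue` as a stand-in for the redis/resque queue mentioned above, and the `files` dictionary, `fake_label_image`, and `drain` are hypothetical names for illustration only, not Appwrite APIs.

```python
import queue


def fake_label_image(data: bytes) -> list:
    """Stand-in for a real model; always returns one placeholder label."""
    return ["placeholder-label"]


def drain(jobs: queue.Queue, files: dict, analyze) -> int:
    """Process every queued file id and attach the results as metadata.

    Returns the number of jobs processed.
    """
    processed = 0
    while True:
        try:
            file_id = jobs.get_nowait()
        except queue.Empty:
            return processed
        record = files[file_id]
        # The upload request already returned; results land on the file later.
        record["metadata"]["labels"] = analyze(record["data"])
        jobs.task_done()
        processed += 1


# Hypothetical in-memory file store standing in for the storage service.
files = {"file-1": {"data": b"raw image bytes", "metadata": {}}}
jobs = queue.Queue()
jobs.put("file-1")
processed = drain(jobs, files, fake_label_image)
```

The upload path only enqueues a file id, so the analysis cost never shows up in API latency; the trade-off is that the metadata becomes available some time after the upload completes.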
-
|
@eldadfux setting the functionality up as a background task would be great for reducing the performance penalty, but it might be limiting for use cases where we want to use the API for the core app logic. We might need to come up with a way to support both. Another scenario might be conditional upload. For example, post an image to the API and specify conditions, e.g., that it contains at least one face with a confidence score of >0.5. The API could then look for faces, talk to the storage API internally to save the image if it meets the condition, and return the metadata from the storage API; if the image does not meet the condition, it can simply return an error message. |
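A rough sketch of the conditional-upload flow described above, under stated assumptions: `Detection`, `conditional_upload`, and the `detect`/`store` callables are hypothetical stand-ins for the face detector and the internal storage API, and the response shape is invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Detection:
    label: str
    confidence: float


def conditional_upload(
    image: bytes,
    detect: Callable[[bytes], List[Detection]],
    store: Callable[[bytes], dict],
    label: str = "face",
    min_confidence: float = 0.5,
    min_count: int = 1,
) -> dict:
    """Store the image only if enough detections pass the confidence threshold."""
    matches = [
        d for d in detect(image)
        if d.label == label and d.confidence > min_confidence
    ]
    if len(matches) < min_count:
        return {"stored": False, "error": f"condition not met for '{label}'"}
    # Condition met: hand the file to storage and return its metadata.
    return {"stored": True, "file": store(image), "matches": len(matches)}


def fake_detect(image: bytes) -> List[Detection]:
    # Placeholder detector: one confident face and one weak one.
    return [Detection("face", 0.9), Detection("face", 0.3)]


def fake_store(image: bytes) -> dict:
    # Placeholder for the storage API call.
    return {"id": "file-1", "size": len(image)}


accepted = conditional_upload(b"image bytes", fake_detect, fake_store)
rejected = conditional_upload(b"image bytes", fake_detect, fake_store, min_confidence=0.95)
```

Because the check runs before the storage call, a rejected image is never written, which is the main difference from the background-task approach.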
-
|
@monatis I totally get what you're saying. We need to figure out the right balance. One of Appwrite's major selling points is hiding complexity from developers, but at the end of the day we don't want to limit more advanced scenarios, so finding that balance is crucial. |
-
|
Anyway, I think a demo API in an independent container could be great and will also help us figure out more of the things we need to decide about the scope of these features. |
-
|
Agreed. |
-
|
@monatis let me know if you need any help. I am available on our Discord server: |
-
|
@eldadfux I've already made some progress in this, coding a Docker-containerized object detection API. I'm travelling this weekend to talk at some developer events, but I'll share my code as soon as possible for further discussion and to get some clues around integration with the storage API. |
-
|
@monatis sounds cool! Have a great trip and keep us posted! 👍🏻 |
-
|
I pushed the first draft of image understanding APIs for Appwrite to a private repository on GitHub and invited @eldadfux as a collaborator for further discussion. I might invite others as collaborators to this repo as well, but I find it more appropriate to keep it private at this very early stage, where we need to figure out many things before going public.
Now I'm planning to
We can choose one of the possible ways for integration with the storage service.
Which one would you prefer, or any other idea on this? |
-
|
Hey @monatis! Thank you for keeping us up-to-date with your progress. I did see you invited me as a collaborator to the image-understanding repo, and I will make some time during this week or the upcoming weekend to give it a proper review. I think that at this stage, it will be best if we develop all AI/ML capabilities as standalone containers that are 100% independent and can be considered their own project. Appwrite should be the only gateway for developers and clients, so I think it will be best for Appwrite and the ML/AI containers to communicate internally over HTTP, as do all the other containers Appwrite communicates with. I was also thinking a bit more about how we should pass files between the two containers, and I don’t think we could use a fileId or a file path. This is because Appwrite encrypts all files before saving them on the storage device, and I think it's best for both security and separation of concerns that the ML/AI container won’t deal with decrypting files or understanding Appwrite complexities. I guess this only leaves us with passing the entire file using a multipart POST request. |
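As a sketch of what passing the entire file in a multipart POST could look like between containers, the snippet below hand-encodes a single-file `multipart/form-data` body with the Python standard library. The `encode_multipart` helper, the field name, and the `http://ml-container/v1/detect` URL are assumptions made for illustration; the request object is only constructed, never sent.

```python
import urllib.request
import uuid


def encode_multipart(field: str, filename: str, content: bytes,
                     content_type: str = "application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns (headers, body). Filenames containing quotes are not escaped here.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return headers, head + content + tail


# The already-decrypted file bytes would come from Appwrite's storage layer.
headers, body = encode_multipart("file", "cat.png", b"\x89PNG raw bytes", "image/png")

# Hypothetical internal endpoint of the ML container; built but not sent.
request = urllib.request.Request(
    "http://ml-container/v1/detect", data=body, headers=headers, method="POST"
)
```

With this arrangement the ML container only ever sees plain file bytes, so it needs no knowledge of Appwrite's encryption or storage layout.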
-
|
Hey guys, can anyone please help me out? I want to deploy a deep learning image segmentation model (DeepLabv3+) on a server so that I can use it in a Flutter-based mobile application. I have created the model, and it is in .h5 format, but the problem is that I don't know which platform to upload the model to or how to use its API from the Flutter mobile app. |
-
|
AI/ML in a BaaS context is interesting because the platform is already solving the hard parts of multi-tenant state management, auth, and storage. The question is how to extend that to agent-native use cases. A few things that would be high-value additions from an agent perspective:
- Agent identity as a first-class entity: not just "user" and "service account" but "agent" with its own auth principal, permission scopes, and audit trail. An agent that acts on behalf of a user should be distinguishable from the user acting directly. This matters for compliance, for billing attribution, and for detecting when an agent goes out of scope.
- Persistent memory store: agents need memory across sessions that's more structured than a generic document store. Ideally: entity-keyed storage with time-decay support, semantic search, and compaction policies. Vector-native storage is becoming table stakes here.
- Budget/quota primitives: an agent that can make unlimited API calls or DB reads is a liability. BaaS platforms are well-positioned to expose per-agent rate limits and cost ceilings as first-class configuration, not something each developer has to implement themselves.
- Real-time event streams for agents: not just for UI reactivity, but so agents can subscribe to "did my long-running sub-task complete?" events without polling. This maps well to what Supabase Realtime and Appwrite's event system already do.
We've been building this layer in KinthAI's agent network on top of OpenClaw: https://blog.kinthai.ai/221-agents-multi-agent-coordination-lessons (the identity/delegation piece is at https://blog.kinthai.ai/agent-wallet-economic-models-autonomous-agents). Which of these would be most useful for the use cases you're hearing from Appwrite users? |
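For the budget/quota point, a very small sketch of a per-agent ceiling might look like the following. `AgentBudget` and `try_charge` are hypothetical names, not an existing Appwrite feature, and costs are tracked in integer cents to avoid floating-point drift.

```python
from dataclasses import dataclass


@dataclass
class AgentBudget:
    """Hard ceilings on call count and spend for a single agent."""
    max_calls: int
    max_cost_cents: int
    calls: int = 0
    cost_cents: int = 0

    def try_charge(self, cost_cents: int) -> bool:
        """Record one call if it fits within both ceilings; otherwise refuse."""
        over_calls = self.calls + 1 > self.max_calls
        over_cost = self.cost_cents + cost_cents > self.max_cost_cents
        if over_calls or over_cost:
            return False
        self.calls += 1
        self.cost_cents += cost_cents
        return True


# Budgets keyed by a hypothetical agent id.
budgets = {"agent-42": AgentBudget(max_calls=2, max_cost_cents=100)}
results = [budgets["agent-42"].try_charge(40) for _ in range(3)]
```

Here the third call is refused because the call ceiling is reached, even though the cost ceiling still has room; a platform-level version would presumably also need persistence and reset windows.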
-
|
AI/ML functionality in a BaaS like Appwrite makes a lot of sense. The patterns that would have the most practical impact:
1. Vector storage and similarity search
2. Streaming response endpoint
3. Background AI job queue
4. Multi-modal file processing
5. Agent-compatible auth
The biggest risk to avoid: building AI features that require Appwrite's hosted models. The more valuable approach is building the infrastructure (vectors, streaming, job queues) that lets developers bring their own models. |
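To illustrate the first pattern, here is a brute-force similarity search in plain Python. It is only a sketch: the `index` contents and the `top_k` helper are made up for illustration, the embeddings would come from whatever model the developer brings, and a real vector store would use an approximate index instead of scanning every vector.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, index: dict, k: int = 2) -> list:
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine(query, vec), key) for key, vec in index.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:k]]


# Toy two-dimensional "embeddings" keyed by document id.
index = {"doc-a": [1.0, 0.0], "doc-b": [0.0, 1.0], "doc-c": [1.0, 1.0]}
nearest = top_k([1.0, 0.0], index)
```

The storage layer only needs to persist vectors and answer nearest-neighbour queries; it never has to know which model produced the embeddings.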
-
|
For Appwrite, AI/ML would be strongest if it is exposed as platform primitives rather than a single generic inference endpoint. I would separate three execution paths:
The security model matters as much as the model API. Derived artifacts should carry provenance: model name/version, input object hash, tenant/project ID, service identity, creation time, retention policy, and consent/classification flags. ACLs should flow from the source object to derived embeddings and labels; otherwise vector search can become an access-control side channel. For payloads, base64 is convenient for early demos, but production APIs should prefer Storage object references for large files. That gives better auditability, scanning, retry behavior, deduplication, and lifecycle management while keeping inference services behind the same project/auth boundaries as the rest of Appwrite. |
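One way to picture the provenance and ACL-inheritance idea is the sketch below. The `SourceObject` and `DerivedArtifact` types, their field names, and the `derive`/`visible_to` helpers are all hypothetical; they cover only a subset of the fields listed above and are not Appwrite data structures.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SourceObject:
    object_id: str
    project_id: str
    data: bytes
    read_acl: frozenset  # principals allowed to read the source object


@dataclass(frozen=True)
class DerivedArtifact:
    kind: str            # e.g. "embedding" or "labels"
    payload: tuple
    model: str
    model_version: str
    input_sha256: str
    project_id: str
    created_at: str
    read_acl: frozenset  # copied from the source, never widened


def derive(source: SourceObject, kind: str, payload: tuple,
           model: str, model_version: str) -> DerivedArtifact:
    """Build a derived artifact that carries provenance and the source's ACL."""
    return DerivedArtifact(
        kind=kind,
        payload=payload,
        model=model,
        model_version=model_version,
        input_sha256=hashlib.sha256(source.data).hexdigest(),
        project_id=source.project_id,
        created_at=datetime.now(timezone.utc).isoformat(),
        read_acl=source.read_acl,
    )


def visible_to(artifacts, principal: str) -> list:
    """Filter artifacts before search so results cannot leak across ACLs."""
    return [a for a in artifacts if principal in a.read_acl]


photo = SourceObject("obj-1", "proj-1", b"image bytes", frozenset({"user:alice"}))
embedding = derive(photo, "embedding", (0.1, 0.2), "example-model", "1.0")
alice_view = visible_to([embedding], "user:alice")
bob_view = visible_to([embedding], "user:bob")
```

Filtering by the inherited ACL before running similarity search is what prevents embeddings from acting as a side channel to objects a caller cannot read.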
-
Hi,
I want to contribute to Appwrite with a separate container for AI/ML functionalities. It will be served as a web API just like other services, and it might be tightly integrated with them, e.g. storage API to process images.
Possible functionalities might include:
and other suggestions are welcome.
In image understanding APIs, I'm thinking of accepting images as base64-encoded strings in the JSON payloads of POST requests. So, a sample request might be as in the following:
{ "image": { "type": "base64", "data": "base64 encoded string goes here" }, "feature": "nudity-detection" }
One advantage of this approach over multipart/form-data is that we can also accept image URLs instead of base64-encoded image data with the same schema, as in the following:
For Appwrite, we can also accept a file id uploaded through storage API.
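The three input options could share one request builder, sketched below. Only the `"base64"` type value appears in the example request; the `"url"` and `"file"` type names, the `ALLOWED_TYPES` set, and the `build_request` helper are assumptions made for illustration.

```python
import base64
import json

# Only "base64" comes from the example request; "url" and "file" are
# hypothetical type names for the URL and storage-file-id variants.
ALLOWED_TYPES = {"base64", "url", "file"}


def build_request(source_type: str, data: str, feature: str) -> str:
    """Return the JSON body for an image-understanding request."""
    if source_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported image type: {source_type}")
    if source_type == "base64":
        # validate=True rejects characters outside the base64 alphabet.
        base64.b64decode(data, validate=True)
    return json.dumps(
        {"image": {"type": source_type, "data": data}, "feature": feature}
    )


encoded = base64.b64encode(b"raw image bytes").decode("ascii")
inline_body = build_request("base64", encoded, "nudity-detection")
url_body = build_request("url", "https://example.com/cat.png", "nudity-detection")
file_body = build_request("file", "hypothetical-file-id", "nudity-detection")
```

Keeping the same envelope for all three sources means clients switch input modes by changing a single field, at the cost of the size overhead that base64 adds for inline data.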
Please leave your comments to help us respond to more needs and use cases with a short learning curve. What other use cases could we provide an AI/ML API for? How should we implement them to minimize overhead on the client side yet remain flexible for different scenarios?