The goal of funqDB is to replace the relational model, relational algebra, ORMs, and SQL in the long run.
- purely functional (key/value) data model
- same modeling concept at all levels, no matter whether we are looking at "tuples", "relations", or "databases", ...
- all operators are unary: input is a function, output is a function
- query language is a façade and part of the embedding programming language
- same power for updates as for reading
- easily extensible
- the notion of an "index" is built into the data model
None of this is true for SQL.
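The points above can be illustrated with a minimal sketch. It uses plain Python dicts as stand-ins for functions (key to value) and is not funqDB's actual API; the names `filter_values`, `users`, and `db` are hypothetical and chosen only for this illustration. It shows one unary operator (function in, function out) being applied unchanged to a "tuple", a "relation", and a "database":

```python
from typing import Any, Callable, Dict

# Hypothetical illustration only: plain dicts stand in for functions
# (key -> value). This is NOT funqDB's API.

# A "tuple" maps attribute names to values.
alice = {"name": "Alice", "age": 31}
bob = {"name": "Bob", "age": 17}

# A "relation" maps keys to tuples; the key mapping doubles as an index.
users = {1: alice, 2: bob}

# A "database" maps relation names to relations.
db = {"users": users}


def filter_values(predicate: Callable[[Any], bool]) -> Callable[[Dict], Dict]:
    """Build a unary operator: it takes one function and returns one function."""
    def operator(f: Dict) -> Dict:
        return {k: v for k, v in f.items() if predicate(v)}
    return operator


# The same operator works at every level of the model.
adults = filter_values(lambda t: t["age"] >= 18)(users)              # relation level
non_empty = filter_values(lambda rel: len(rel) > 0)(db)              # database level
strings_only = filter_values(lambda v: isinstance(v, str))(alice)    # tuple level
```

Because every level is "just a function", the lookup `users[2]` is both a data access and an index lookup, which is one way to read the claim that the index is part of the data model.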
funqDB is built around the central ideas of the vision paper:
[Dit26] Dittrich, Jens. A Functional Data Model and Query Language is All You Need. In Proceedings of the 25th International Conference on Extending Database Technology (EDBT 2026).
Abstract: We propose the vision of a functional data model (FDM) and an associated functional query language (FQL). Our proposal has far-reaching consequences: we show a path to come up with a modern query language (QL) that solves (almost if not) all problems of SQL (NULL-values, type marshalling, SQL injection, missing querying capabilities for updates, etc.). FDM and FQL are much more expressive than the relational model and SQL. In addition, in contrast to SQL, FQL integrates smoothly into existing programming languages. In our approach both QL and PL become the ‘same thing’, thus opening up several interesting holistic optimization opportunities between compilers and databases.
@inproceedings{dittrich2026FDMFQL, title={A Functional Data Model and Query Language is All You Need}, author={Jens Dittrich}, booktitle = {EDBT}, year={2026}, }
We highly recommend reading that paper to understand the motivation and the ideas behind funqDB, FDM, and FQL. This README is not meant to be a replacement for the paper, but rather a guide to the project: its current state, how to use it, and how to contribute.
This project is in an early alpha state (I started implementing it at the end of January 2026). It is not (yet) ready for production use. It is a proof of concept for the ideas in the paper and a playground, thought experiment, and exercise in what a functional data model (FDM) and functional query language (FQL) may look like. Currently, everything is implemented in Python (I love Python). Hence, performance-wise the current version will obviously not be a match for data processing done in C++ or Rust. However, keep in mind that most data management problems are "small". If you are wondering whether your data is "big" or "small", the likelihood is high (>95%) that your data is "small". By "small" I mean that it can be processed in memory on a single machine, using Python, and that will be just fine: you won't feel a performance bottleneck. This is actually one of the reasons why Python has become so popular for data processing in the past decade.
Note that, as outlined in the paper, the ideas of FDM and FQL are not bound to one particular programming language: FQL is just a programming language façade (in this particular case a Python façade) for a backend. In the long run, I would like to have façades in other languages, as well as a C++ or Rust backend.
I believe that software as fundamental as data management and query processing should be open source. That is why I am publishing the project under the AGPL license. As my main job is being a Professor of Computer Science at Saarland University, I have the freedom to do this. However, I also have other obligations, e.g. teaching, research, administration, developing the Masterhorst application system, etc., so at this point, this cannot be a full-time job (unfortunately).
Eventually, I would like the development of this project to be backed by more people, ideally as part of a foundation. If you are interested in supporting this project financially through such a foundation, get in touch.
If you are a developer and want to contribute code: for bug reports and small fixes, you can just open an issue or a PR. For larger features, please discuss the feature in an issue before starting to implement it, to make sure that we are on the same page about the feature and its implementation.
For any PR make sure:
- that your PR is about a single issue, e.g. a bug fix or a new feature, not a mix of multiple issues
- that all your code is unit tested: you should have tests for all new features and bug fixes
- that your test (line) coverage is high, ideally 100%, but at least above 90%
- that you have run all tests locally and that they all pass before submitting your PR
- that your code is well documented:
  - docstrings for all public functions and classes
  - code comments for all non-trivial code
  - update the documentation if you add new features or change existing ones
  - ideally also update the tutorial if you add new features or change existing ones
- attribute functions (AFs) as replacements for tuples, relations, databases, and sets of databases
- operators including simple relational algebra operators such as selection, projection, join, etc., but also more complex ones such as subdatabase, group-by, etc.
- an observer mechanism for AFs, i.e. when an AF is updated, all AFs that depend on it are informed and can react to the change (TODO: make this work through the store)
- support for composite primary keys
- relationship functions (RFs) as replacements for n:m-relationships, i.e. they can be used to express relationships between AFs, e.g. one-to-many, many-to-many, etc.
- a store for AFs, currently using SqliteDict as a key/blob-store; because the values are stored as opaque blobs, query processing cannot be pushed down into the store
- automatic on-demand swizzling/unswizzling of references (for reads; TODO: writes)
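To make the key/blob-store limitation mentioned above concrete, here is a minimal sketch. It deliberately uses only the standard library (`sqlite3` and `pickle`) instead of SqliteDict, and the names `put`, `get`, and `scan_filter` are hypothetical; it is an illustration of the general pattern under these assumptions, not funqDB's store implementation:

```python
import pickle
import sqlite3

# Hypothetical sketch using only the standard library. funqDB itself uses
# SqliteDict; its API is intentionally not shown here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store (key TEXT PRIMARY KEY, value BLOB)")


def put(key: str, value: dict) -> None:
    """Serialize the value into an opaque blob and store it under key."""
    conn.execute(
        "INSERT OR REPLACE INTO store (key, value) VALUES (?, ?)",
        (key, pickle.dumps(value)),
    )


def get(key: str) -> dict:
    """Point lookup by key: the one access path the store can serve itself."""
    row = conn.execute("SELECT value FROM store WHERE key = ?", (key,)).fetchone()
    if row is None:
        raise KeyError(key)
    return pickle.loads(row[0])


def scan_filter(predicate) -> dict:
    """Filter on value contents. Every blob is fetched and decoded in Python,
    because SQLite cannot evaluate a predicate inside the pickled bytes."""
    result = {}
    for key, blob in conn.execute("SELECT key, value FROM store"):
        value = pickle.loads(blob)
        if predicate(value):
            result[key] = value
    return result


put("u1", {"name": "Alice", "age": 31})
put("u2", {"name": "Bob", "age": 17})
adults = scan_filter(lambda t: t["age"] >= 18)
```

In this sketch, `get` is cheap because it uses the primary key, whereas `scan_filter` has to read and deserialize the whole table. That is the sense in which a selection cannot be pushed down into a key/blob store.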
My mid term goals are:
- to have a complete implementation of the ideas in the [Dit26] paper.
- to have a complete tutorial and documentation for the project.
- to have minimal transactional processing capabilities (in Python, but this can also be done in non-Python backends):
  - e.g. CRUD support for concurrent updates and ACID
  - MVCC
  - recovery
  - database versioning as of this paper
- to have the project capable of replacing ORMs in production environments, to prove the point:
  - sample Django project using FDM and FQL rather than sqlite
  - data model definitions without resorting to automatic and manual migrations
  - POC for ResultDB extension
- educational slide sets
- educational videos that can be used for lectures
My long term goals are:
- other backends in other languages, e.g. Rust, C++, etc. (volunteers needed)
- other attribute functions beyond tabular data, e.g. tensors, etc. (volunteers needed)
- to have a complete implementation of the ideas in the [ND25] and [RD25] papers, i.e. support for database-returning queries and query optimization for database-returning queries (volunteers needed)
For the moment, you can clone the repository or download it as a zip archive and then install the dependencies with Poetry, e.g. by running poetry install in the project directory.
See the tutorial, which is a work in progress.
All tests are located in the tests directory. You can run all tests with pytest from the project directory, e.g. via pytest tests. You can also run individual test files, e.g. pytest tests/test_attribute_functions.py.
The tests also serve as a good starting point to understand how to use the project, as they contain a lot of textbook-style code examples. I often re-use the examples from the tests in the tutorial.
see talks
will be published here
The [Dit26] paper was preceded by a couple of previous versions of that paper (with quite some variation in content):
[Dit25b] Jens Dittrich. A Functional Data Model and Query Language is All You Need. arXiv:2507.20671 [cs.DB]. This paper also contains a lot of Python/FQL code examples.
Abstract: We propose the vision of a functional data model (FDM) and an associated functional query language (FQL). Our proposal has far-reaching consequences: we show a path to come up with a modern QL that solves (almost if not) all problems of SQL (NULL-values, impedance mismatch, SQL injection, missing querying capabilities for updates, etc.). FDM and FQL are much more expressive than the relational model and SQL. In addition, in contrast to SQL, FQL integrates smoothly into existing programming languages. In our approach both QL and PL become the "same thing", thus opening up some interesting holistic optimization opportunities between compilers and databases. In FQL, we also do not need to force application developers to switch to unfamiliar programming paradigms (like SQL or datalog): developers can stick with the abstractions provided by their programming language.
@misc{dittrich2025functionaldatamodelquery, title={A Functional Data Model and Query Language is All You Need}, author={Jens Dittrich}, year={2025}, eprint={2507.20671}, archivePrefix={arXiv}, primaryClass={cs.DB}, url={https://arxiv.org/abs/2507.20671}, }
But all of this started with the following paper, a thought experiment exploring the ideas that eventually led to the vision papers [Dit25b] and then [Dit26]. It contains a lot of code examples and is a good read to understand the motivation and the ideas behind funqDB.
[Dit25a] Jens Dittrich. How to get Rid of SQL, Relational Algebra, the Relational Model, ERM, and ORMs in a Single Paper -- A Thought Experiment. arXiv:2504.12953 [cs.DB].
Abstract: Without any doubt, the relational paradigm has been a huge success. At the same time, we believe that the time is ripe to rethink how database systems could look like if we designed them from scratch. Would we really end up with the same abstractions and techniques that are prevalent today? This paper explores that space. We discuss the various issues with both the relational model (RM) and the entity-relationship model (ERM). We provide a unified data model: the relational map type model (RMTM) which can represent both RM and ERM as special cases and overcomes all of their problems. We proceed to identify seven rules that an RMTM query language (QL) must fulfill and provide a foundation of a language fulfilling all seven rules. Our QL operates on maps which may represent tuples, relations, databases or sets of databases. Like that we dramatically expand the existing operational abstractions found in SQL and relational algebra (RA) which only operate on relations/tables. In fact, RA is just a special case of our much more generic approach. This work has far-reaching consequences: we show a path how to come up with a modern QL that solves (almost if not) all problems of SQL. Our QL is much more expressive than SQL and integrates smoothly into existing programming languages (PL). We also show results of an initial experiment showcasing that just by switching to our data model, and without changing the underlying query processing algorithms, we can achieve speed-ups of up to a factor 3. We will conclude that, if we build a database system from scratch, we could and should do this without SQL, RA, RM, ERM, and ORMs.
@misc{dittrich2025ridsqlrelationalalgebra, title={How to get Rid of SQL, Relational Algebra, the Relational Model, ERM, and ORMs in a Single Paper -- A Thought Experiment}, author={Jens Dittrich}, year={2025}, eprint={2504.12953}, archivePrefix={arXiv}, primaryClass={cs.DB}, url={https://arxiv.org/abs/2504.12953}, }
The following papers are also related. In these works we proposed to change SQL to return a subdatabase and how to perform query processing and optimization accordingly:
[ND25] Joris Nix, Jens Dittrich. Extending SQL to Return a Subdatabase. SIGMOD 2025.
Abstract: Every SQL statement is limited to return a single, possibly denormalized table. This approximately 50-year-old design decision has far-reaching consequences. The most apparent problem is the redundancy introduced through denormalization, which can result in long transfer times of query results and high memory usage for materializing intermediate results. Additionally, regardless of their goals, users are forced to fit query computations into one single result, mixing the data retrieval and transformation aspect of SQL. Moreover, both problems violate the principles and core ideas of normal forms. In this paper, we argue for eliminating the single-table limitation of SQL. We extend SQL's SELECT clause by the keyword `RESULTDB' to support returning a result subdatabase. Our extension has clear semantics, i.e., by annotating any existing SQL statement with the RESULTDB keyword, the DBMS returns the tables participating in the query, each restricted to the relevant tuples that occur in the traditional single-table query result. Thus, we do not denormalize the query result in any way. Our approach has significant, far-reaching consequences, impacting the querying of hierarchical data, materialized views, and distributed databases, while maintaining backward compatibility. In addition, our proposal paves the way for a long list of exciting future research opportunities. We propose multiple algorithms to integrate our feature into both closed-source and open-source database systems. For closed-source systems, we provide several SQL-based rewrite methods. In addition, we present an efficient algorithm for cyclic and acyclic join graphs that we integrated into an open-source database system. We conduct a comprehensive experimental study. Our results show that returning multiple individual result sets can significantly decrease the result set size. Furthermore, our rewrite methods and algorithm introduce minimal overhead and can even outperform single-table execution in certain cases.
@inproceedings{Nix2025ExtendingSQL, title={Extending SQL to Return a Subdatabase}, author={Joris Nix and Jens Dittrich}, year={2025}, booktitle = {SIGMOD}, publisher = {ACM} }
[RD25] Simon Rink, Jens Dittrich. Query Optimization for Database-Returning Queries. SIGMOD 2026.
Abstract: Recently, the novel concept of database-returning SQL queries (DRQs) was introduced. Instead of a single, (potentially) denormalized result table, DRQs return an entire subdatabase with a single SQL query. This subdatabase represents a subset of the original database, reduced to the relations, tuples, and attributes that contribute to the traditional join result. DRQs offer several benefits: they reduce network traffic in client-server settings, can lower memory requirements for materializing results, and significantly simplify querying hierarchical data. Currently, two state-of-the-art algorithms exist to compute DRQs: (1.) ResultDBSemi-Join builds upon Yannakakis’ semi-join reduction algorithm by adding support for cyclic queries. (2.) ResultDBDecompose computes the standard single-table result and projects the result to the base tables to obtain the resulting subdatabase. However, multiple issues can be identified with these algorithms. First, ResultDBSemi-Join employs simple heuristics to greedily solve the underlying enumeration problems, often leading to suboptimal query plans. Second, each algorithm performs best under different conditions, so it is up to the user to choose the appropriate one for a given scenario. In this paper, we address these two issues. We propose two enumeration algorithms for ResultDBSemi-Join to handle the Root Node Enumeration Problem (RNEP) and the Tree Folding Enumeration Problem (TFEP). Further, we present a unified enumeration algorithm, TDResultDB, to automatically decide between plans generated by our new enumeration algorithms for ResultDBSemi-Join and ResultDBDecompose. Through a comprehensive experimental evaluation, we show that the enumeration time overhead introduced by our methods remains minimal. Furthermore, by effectively solving the RNEP and TFEP, we achieve up to a 6x speed-up in query execution time for ResultDBSemi-Join, whereas TDResultDB consistently selects the best available plans.
@inproceedings{RinkD2026, title={Query Optimization for Database-Returning Queries}, author={Simon Rink and Jens Dittrich}, issue_date = {December 2025}, publisher = {Association for Computing Machinery}, address = {New York, NY, USA}, volume = {3}, number = {6}, journal = {Proc. ACM Manag. Data}, url = {https://doi.org/10.1145/3769818}, doi = {10.1145/3769818}, }
This project is licensed under the AGPL-3.0 License. See the LICENSE file for more details.