Please see the contributing document.
The short of things is that for basic build and unit testing, these are needed:
- Install `golang` on your system if you do not already have version >= 1.21, per the `golang` doco.
- Install `python` on your system if you do not already have version >= 3.11, available from the `python` website and numerous package managers.
- Using a `venv` or otherwise, install the requisite python packages, eg: (system permitting) from the repository root: `pip install -r cicd/requirements.txt`.
Then, each of these should be run from the repository root:
- Get required `go` libs: `go get ./...`.
- Run unit tests: `python cicd/python/build.py --test`.
- Build the application: `python cicd/python/build.py --build` (this will generate an executable at `./build/stackql`).
For serious development, simulated integration tests are essential. So, there are more dependencies:
- Install `psql`. On some systems, this can be done as client only and/or with various package managers; the fallback is to install postgres manually.
Having installed all dependencies, the robot tests should be run from the repository root directory (this relies upon the executable in `./build/stackql`, built above):
- Run mocked functional tests: `python cicd/python/build.py --robot-test`. This will subject the executable to the automated testing regimen.
Running linting locally is also a nice-to-have, and can be done:
- Install `golangci-lint` v1.59.1 (the current version used in CI) per the `golangci-lint` doco.
- From the root directory of the repository: `golangci-lint run`.
The inherent difficulty in generically serialising xml is nicely expressed by the golang dev community in the encoding/xml documentation:
Mapping between XML elements and data structures is inherently flawed: an XML element is an order-dependent collection of anonymous values, while a data structure is an order-independent collection of named values. See encoding/json for a textual representation more suitable to data structures.
As of now, stackql handles xml SERDE through the core, and does not route this to SDKs. Depending on priorities, this can be revisited with care.
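The order-dependence described in the quote above can be seen with a small round trip through `encoding/xml`. This is a minimal sketch and not stackql's actual XML handling; the `doc` type and `roundTrip` helper are hypothetical names introduced only for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// doc is a hypothetical struct used only for this illustration.
// Repeated <a> and <b> children are collected into separate slices,
// so the way they were interleaved in the input is not recorded.
type doc struct {
	XMLName xml.Name `xml:"doc"`
	A       []string `xml:"a"`
	B       []string `xml:"b"`
}

// roundTrip decodes the input into a doc value and encodes it again.
func roundTrip(in string) (string, error) {
	var d doc
	if err := xml.Unmarshal([]byte(in), &d); err != nil {
		return "", err
	}
	out, err := xml.Marshal(d)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	in := `<doc><a>1</a><b>2</b><a>3</a></doc>`
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	fmt.Println("in: ", in)
	fmt.Println("out:", out)
}
```

After the round trip, the `<b>` element follows both `<a>` elements, because the struct fields are emitted in declaration order rather than in the original document order.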
env CGO_ENABLED=1 PLANCACHEENABLED=false go build \
--tags "sqlite_stackql" \
-ldflags "-X github.com/stackql/stackql/internal/stackql/cmd.BuildMajorVersion=${BUILDMAJORVERSION:-1} \
-X github.com/stackql/stackql/internal/stackql/cmd.BuildMinorVersion=${BUILDMINORVERSION:-1} \
-X github.com/stackql/stackql/internal/stackql/cmd.BuildPatchVersion=${BUILDPATCHVERSION:-1} \
-X github.com/stackql/stackql/internal/stackql/cmd.BuildCommitSHA=$BUILDCOMMITSHA \
-X github.com/stackql/stackql/internal/stackql/cmd.BuildShortCommitSHA=$BUILDSHORTCOMMITSHA \
-X \"github.com/stackql/stackql/internal/stackql/cmd.BuildDate=$BUILDDATE\" \
-X \"stackql/internal/stackql/planbuilder.PlanCacheEnabled=$PLANCACHEENABLED\" \
-X github.com/stackql/stackql/internal/stackql/cmd.BuildPlatform=$BUILDPLATFORM" -o ./build ./stackql
At this time, we are not dogmatic about how to implement unit tests. Aspirationally, unit tests can be implemented in similar fashion to the none-too opinionated official testing package documentation, and in particular the overview section.
We have upgraded to golang >= 1.25.3 in order to take advantage of time simulation and other capabilities in `testing/synctest`.
To run all unit tests:
go test -timeout 1200s --tags "sqlite_stackql" ./...
Note: this requires the local build (above) to have been completed successfully, which builds a binary in ./build/.
env PYTHONPATH="$PYTHONPATH:$(pwd)/test/python" robot -d test/robot/reports test/robot/functional
Or better yet, if you have docker desktop and the postgres image cited in the docker compose files:
robot --variable SHOULD_RUN_DOCKER_EXTERNAL_TESTS:true -d test/robot/functional test/robot/functional
Please see the mock testing doco.
The vscode tooling configuration is mostly ready to use, as seen in the `.vscode` directory. You will need to create a file at the `.gitignore`d location `.vscode/.env`. The simplest way to get going is to copy the example: `cp .vscode/example.env .vscode/.env`.
The debugger config is pretty messy, and probably with time we will slim it down. That said, it is far from useless as an example.
Keen to expose some new functionality through stackql? We are very keen on this!
Please see registry_contribution.md.
At this stage, authentication config must be specified for each provider, even for unauthenticated ones. Supported auth types are:
- `api_key`.
- `basic`.
- `interactive` for interactive OAuth; thus far only google is supported, via the `gcloud` command line tool.
- `service_account` for json style private keys (eg: google service accounts).
- `null_auth` for unauthenticated providers.
If you want further auth types or discover bugs, please raise an issue.
Examples are present here.
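As a rough illustration, the JSON passed to the `--auth` flag (as used in the test commands later in this document) could be assembled programmatically. This is a sketch only: the `providerAuth` type and `buildAuth` helper are hypothetical, the file paths are placeholders, and setting `type` explicitly to `service_account` for google is an assumption (the example command in this document omits it).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// providerAuth is a hypothetical type mirroring the two keys that appear
// in the --auth example in this document: "type" and "credentialsfilepath".
type providerAuth struct {
	Type                string `json:"type,omitempty"`
	CredentialsFilePath string `json:"credentialsfilepath,omitempty"`
}

// buildAuth serialises a per-provider auth map to a JSON string.
func buildAuth(cfg map[string]providerAuth) (string, error) {
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Placeholder paths; substitute your own credential files.
	auth, err := buildAuth(map[string]providerAuth{
		"google": {Type: "service_account", CredentialsFilePath: "/path/to/sa-key.json"},
		"okta":   {Type: "api_key", CredentialsFilePath: "/path/to/okta-token.txt"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(auth)
}
```

The resulting string would then be supplied as the value of `--auth`, one entry per provider.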
Note that this feature is in alpha. We will update timelines for General Availability after a period of analysis and testing. At the time of writing, server mode is most useful for R&D purposes:
- experimentation.
- tooling / system integrations and design thereof.
- development of `stackql` itself.
- development of use cases for the product.
The stackql server leverages the postgres wire protocol and can be used with the psql client, including mTLS auth / encryption in transit. Please see the relevant examples for further details.
In server mode, a thread pool issues one thread to handle each connection.
The following are single threaded:
- Lexical and Syntax Analysis.
- Semantic Analysis.
- Execution of a single, childless primitive.
- Execution of primitives a, b where a > b or a < b in the partial ordering of the plan DAG (that is, one depends on the other). Although it need not be the same thread executing each, they will be strictly sequential.
The following are potentially multi threaded:
- Plan optimization.
- Execution of sibling primitives.
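The execution rules above can be sketched as follows. This is not stackql's implementation: the `primitive` and `recorder` types are hypothetical, and the sketch assumes that a primitive runs only after all of its children have finished.

```go
package main

import (
	"fmt"
	"sync"
)

// primitive is a hypothetical stand-in for a node in the plan DAG.
type primitive struct {
	name     string
	children []*primitive
}

// recorder collects completion order in a goroutine-safe way.
type recorder struct {
	mu    sync.Mutex
	order []string
}

func (r *recorder) add(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.order = append(r.order, name)
}

// execute runs the children of p (siblings of one another) concurrently,
// waits for all of them, and only then runs p itself. Ordered primitives
// are therefore strictly sequential, possibly on different goroutines.
func execute(p *primitive, r *recorder) {
	var wg sync.WaitGroup
	for _, c := range p.children {
		wg.Add(1)
		go func(c *primitive) {
			defer wg.Done()
			execute(c, r)
		}(c)
	}
	wg.Wait()
	r.add(p.name)
}

func main() {
	plan := &primitive{
		name: "join",
		children: []*primitive{
			{name: "select_disks"},
			{name: "select_networks"},
		},
	}
	r := &recorder{}
	execute(plan, r)
	// The two selects may complete in either order; "join" is always last.
	fmt.Println(r.order)
}
```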
Please consult the parser repository.
Really high level stuff:
- Cache system -> db (redis????).
- Cache size limitations and rotation policy.
- Cache persistence format from simple json -> db (redis????).
- Re-use vitess LRU Cache???
- Need reasoned view of tables / joins / rows.
- Migrate responses to MySQL server type a la Vitess.
- DML operations to use similar response filtering to metadata ops.
- Failure modes and possible multiple errors... how to communicate cause and final state to user. Need some overall philosophy that is extensible to transactions.
- Need reasoned view of primitives and optimisations, default extensible method map approach.
- Parallelisation of "atomic" DML ops.
Really, the github action files are the source of truth for build and test and we do encourage perusal of them. However, to keep things brief, here below is the developers' brief for testing.
Requirements are detailed in the root README.
Local testing of the application:
- Run `go test --tags "sqlite_stackql" ./...` tests.
- Build the executable as per the root README.
- Perform registry rewrites as needed for mocking: `python3 test/python/stackql_test_tooling/registry_rewrite.py --srcdir "$(pwd)/test/registry/src" --destdir "$(pwd)/test/registry-mocked/src"`.
- Run robot tests:
  - Functional tests, mocked as needed: `robot -d test/robot/reports test/robot/functional`.
  - Integration tests: `robot -d test/robot/reports test/robot/integration`. For these, you will need to set various environment variables as per the github actions.
- Run the deprecated manual python tests:
  - Prepare with `cp test/db/db.sqlite test/db/tmp/python-tests-tmp-db.sqlite`.
  - Run with `python3 test/deprecated/python/main.py`.
This article gives a nice overview of mocking in golang.
Test coverage is sparse. Regressions are mitigated by go test integration testing in the driver and stackql packages. Some testing functionality is supported through convenience functionality inside the test packages.
If not already done, then install 'cover' with `go get golang.org/x/tools/cmd/cover`.
Then: `go test --tags "sqlite_stackql" -cover ../...`.
Automated functional and integration testing are done largely through robot framework. Please see the robot test readme.
There is some legacy, deprecated manual python testing which will be migrated to robot and decommissioned.
We use golangci-lint.
The linting of go files (and also Actions) for CI is defined in .github/workflows/lint.yml.
To run the linter locally, first ensure you have the same version of golangci-lint as the CI and then either:
- `golangci-lint run` to dump everything to console, or...
- `golangci-lint run > cicd/log/lint.log 2>&1` to send all output to `cicd/log/lint.log` (w.r.t. repository root).
In order to support windows compilation:
brew install mingw-w64
In order to support linux compilation:
export HOMEBREW_BUILD_FROM_SOURCE=1
brew install FiloSottile/musl-cross/musl-cross
Download and unzip. For the sake of example, let us consider the executable ~/Downloads/stackql.
First:
chmod +x ~/Downloads/stackql
Then, on OSX > 10, you will need to whitelist the executable for execution even though it was not signed by an identified developer. The least harmful way to do this is to try to execute some command (below is one candidate), and then open System Settings > Security & Privacy, where there should be some UI to allow execution of the untrusted `stackql` file. At least this works on High Sierra v1.2.1.
Then, run test commands, such as:
~/Downloads/stackql --credentialsfilepath=$HOME/stackql/stackql-devel/cicd/keys/sa-key.json exec "select group_concat(substr(name, 0, 5)) || ' lalala' as cc from google.compute.disks where project = 'lab-kr-network-01' and zone = 'australia-southeast1-b';" -o text
time ./stackql exec --cpuprofile=./select-disks-improved-05.profile --auth='{ "google": { "credentialsfilepath": "'${HOME}'/stackql/stackql-devel/cicd/keys/sa-key.json" }, "okta": { "credentialsfilepath": "'${HOME}'/stackql/stackql-devel/cicd/keys/okta-token.txt", "type": "api_key" } } ' "select name from google.compute.disks where project = 'lab-kr-network-01' and zone = 'australia-southeast1-a';"
https://docs.aws.amazon.com/sdk-for-go/api/aws/signer/v4/
`INSERT RETURNING` can function through two mechanisms:
- Synchronous responses, such as `google.storage.buckets`. The returning clause is a projection on the immediately available response body.
- Asynchronous responses, such as `google.compute.instances` and `google.compute.networks`. The returning clause is a projection on the response body after the await flow has concluded.
Future use cases for `UPDATE RETURNING`, `REPLACE RETURNING` and `DELETE RETURNING` will function in the same observable fashion.
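The difference between the two mechanisms can be sketched in a few lines. This is an illustrative model only and does not reflect stackql internals: `project`, `poller` and `awaitThenProject` are hypothetical names, and the response bodies are made-up sample data.

```go
package main

import (
	"errors"
	"fmt"
)

// project keeps only the requested columns of a response body.
func project(body map[string]any, columns []string) map[string]any {
	out := make(map[string]any, len(columns))
	for _, c := range columns {
		if v, ok := body[c]; ok {
			out[c] = v
		}
	}
	return out
}

// poller is a hypothetical callback returning the latest response body
// and whether the long-running operation has finished.
type poller func() (body map[string]any, done bool, err error)

// awaitThenProject polls until the operation is done (bounded by
// maxAttempts) and then projects the final body.
func awaitThenProject(poll poller, columns []string, maxAttempts int) (map[string]any, error) {
	for i := 0; i < maxAttempts; i++ {
		body, done, err := poll()
		if err != nil {
			return nil, err
		}
		if done {
			return project(body, columns), nil
		}
	}
	return nil, errors.New("operation did not complete within the allowed attempts")
}

func main() {
	// Synchronous case: project the immediately available body.
	syncRow := project(
		map[string]any{"name": "my-bucket", "location": "US", "etag": "abc"},
		[]string{"name", "location"},
	)
	fmt.Println(syncRow)

	// Asynchronous case: project only after the await flow concludes.
	calls := 0
	poll := func() (map[string]any, bool, error) {
		calls++
		if calls < 3 {
			return map[string]any{"status": "RUNNING"}, false, nil
		}
		return map[string]any{"name": "my-instance", "status": "DONE"}, true, nil
	}
	asyncRow, err := awaitThenProject(poll, []string{"name"}, 5)
	if err != nil {
		panic(err)
	}
	fmt.Println(asyncRow)
}
```

In both cases the caller sees the same shape of result, which is the "same observable fashion" described above; only the moment at which the projection is applied differs.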