Async HTTP(S) ClickHouse client for Python 3.6+ with type conversion in both directions, streaming support, lazy decoding on SELECT queries, and a fully typed interface.
> pip install aiochclient
Or, to install with extra requirements for a speedup:
> pip install aiochclient[speedups]
It will additionally install cChardet and aiodns for an aiohttp speedup, and ciso8601 for ultra-fast datetime parsing while decoding data from ClickHouse.
The installation will also try to build Cython extensions for a speed boost (about 30%).
`aiochclient` needs an `aiohttp.ClientSession` for connecting:

```python
from aiochclient import ChClient
from aiohttp import ClientSession


async def main():
    async with ClientSession() as s:
        client = ChClient(s)
        assert await client.is_alive()  # returns True if connection is OK
        await client.execute(
            "CREATE TABLE t (a UInt8, b Tuple(Date, Nullable(Float32))) ENGINE = Memory"
        )
```

For INSERT queries you can pass values as `*args`. Values should be iterables:
```python
import datetime as dt

await client.execute(
    "INSERT INTO t VALUES",
    (1, (dt.date(2018, 9, 7), None)),
    (2, (dt.date(2018, 9, 8), 3.14)),
)
```

For fetching all rows at once, use the `fetch` method:
```python
all_rows = await client.fetch("SELECT * FROM t")
```

For fetching the first row from the result, use the `fetchrow` method:
```python
row = await client.fetchrow("SELECT * FROM t WHERE a=1")
assert row[0] == 1
assert row["b"] == (dt.date(2018, 9, 7), None)
```

You can also use the `fetchval` method, which returns the first value of the first row from the query result:
```python
val = await client.fetchval("SELECT b FROM t WHERE a=2")
assert val == (dt.date(2018, 9, 8), 3.14)
```

With async iteration over the query result stream, you can fetch multiple rows without loading them all into memory at once:
```python
async for row in client.iterate(
    "SELECT number, number*2 FROM system.numbers LIMIT 10000"
):
    assert row[0] * 2 == row[1]
```

Use `fetch`/`fetchrow`/`fetchval`/`iterate` for SELECT queries, and `execute` (or any of the above) for INSERT and all other queries.
All fetch queries return rows as lightweight, memory-efficient objects (since v1.0.0; before that, just tuples) with a full mapping interface, where you can get fields by name or by index:
```python
row = await client.fetchrow("SELECT a, b FROM t WHERE a=1")
assert row["a"] == 1
assert row[0] == 1
assert row[:] == (1, (dt.date(2018, 9, 7), None))
assert list(row.keys()) == ["a", "b"]
assert list(row.values()) == [1, (dt.date(2018, 9, 7), None)]
```

`aiochclient` automatically converts values to the needed type, both when decoding ClickHouse responses and when encoding values for client INSERT queries.
| ClickHouse type | Python type |
|---|---|
| UInt8 | int |
| UInt16 | int |
| UInt32 | int |
| UInt64 | int |
| Int8 | int |
| Int16 | int |
| Int32 | int |
| Int64 | int |
| Float32 | float |
| Float64 | float |
| String | str |
| FixedString | str |
| Enum8 | str |
| Enum16 | str |
| Date | datetime.date |
| DateTime | datetime.datetime |
| Decimal | decimal.Decimal |
| Decimal32 | decimal.Decimal |
| Decimal64 | decimal.Decimal |
| Decimal128 | decimal.Decimal |
| Tuple(T1, T2, ...) | Tuple[T1, T2, ...] |
| Array(T) | List[T] |
| UUID | uuid.UUID |
| Nullable(T) | None or T |
| Nothing | None |
| LowCardinality(T) | T |
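As an illustration of the mapping above, a row of values intended for an INSERT would be built from the Python types in the right-hand column (the column layout here is hypothetical, not a table from this README):

```python
import datetime as dt
import uuid
from decimal import Decimal

# A hypothetical row whose values use the Python types from the table above.
row = (
    1,                                # UInt8 -> int
    3.14,                             # Float64 -> float
    "hello",                          # String / FixedString -> str
    dt.date(2018, 9, 7),              # Date -> datetime.date
    dt.datetime(2018, 9, 7, 12, 0),   # DateTime -> datetime.datetime
    Decimal("10.50"),                 # Decimal -> decimal.Decimal
    uuid.UUID("12345678-1234-5678-1234-567812345678"),  # UUID -> uuid.UUID
    [1, 2, 3],                        # Array(UInt8) -> List[int]
    None,                             # Nullable(T) -> None or T
)
```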
If you want to change the connection pool size, you can use `aiohttp.TCPConnector`. Note that the default pool limit is 100 connections.
Using uvloop and installing with `aiochclient[speedups]` is highly recommended for the sake of speed.
As of the latest version of aiochclient, its throughput with a single task (without gather, parallel clients, and so on) is about 180k-220k rows/sec on SELECT queries and about 50k-80k rows/sec on INSERT queries, depending on the environment and ClickHouse settings.
Please star ⭐ this repository if this project helped you!