QuantStack/pyspark-web-demo

Demo using Spark Connect in a Python Web environment

How it works

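PySpark's Connect client normally talks to the server over native gRPC, which requires HTTP/2. A browser/WASM runtime does not expose the low-level HTTP/2 control that native gRPC needs, and code running there can only issue ordinary HTTP requests. This demo therefore replaces the client transport with gRPC-web, a variant of the protocol that carries gRPC messages over plain HTTP/1.1 requests.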
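To make the transport swap more concrete, here is a minimal sketch of the message framing that gRPC-web uses inside an HTTP body: each frame is a 1-byte flag, a 4-byte big-endian length, and then the payload, with the high bit of the flag marking a trailers frame. The helper names `encode_frame` and `decode_frames` are illustrative only and are not taken from this repository or from PySpark.

```python
import struct

DATA_FLAG = 0x00     # regular message frame
TRAILER_FLAG = 0x80  # high bit set: frame carries trailers, not a message


def encode_frame(payload: bytes, flag: int = DATA_FLAG) -> bytes:
    """Prefix a payload with the 5-byte header: flag + big-endian length."""
    return struct.pack(">BI", flag, len(payload)) + payload


def decode_frames(body: bytes) -> list[tuple[bool, bytes]]:
    """Split an HTTP body into (is_trailer, payload) frames."""
    frames = []
    offset = 0
    while offset < len(body):
        if len(body) - offset < 5:
            raise ValueError("truncated frame header")
        flag, length = struct.unpack_from(">BI", body, offset)
        offset += 5
        if offset + length > len(body):
            raise ValueError("truncated frame payload")
        frames.append((bool(flag & TRAILER_FLAG), body[offset:offset + length]))
        offset += length
    return frames


if __name__ == "__main__":
    # A response body with one message followed by a trailers frame.
    body = encode_frame(b"hello") + encode_frame(b"grpc-status:0\r\n", TRAILER_FLAG)
    for is_trailer, payload in decode_frames(body):
        print("trailers" if is_trailer else "message", payload)
```

Because trailers travel in the body as an ordinary frame, the client does not need HTTP/2 trailing headers, which is what makes this framing usable from a browser.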

On the backend, a Traefik reverse proxy upgrades the connection back to HTTP/2, handles HTTPS and Basic Auth for this demo, and forwards requests directly to a Spark cluster.
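From the client's side, a request that passes through such a proxy needs a gRPC-web content type and a Basic Auth header. The sketch below builds those headers. The function name `build_grpc_web_headers` and the credentials are hypothetical, and the `X-Grpc-Web` marker is a convention of common gRPC-web clients rather than something this demo is known to send.

```python
import base64


def build_grpc_web_headers(username: str, password: str) -> dict[str, str]:
    """Build HTTP headers for a binary gRPC-web request with Basic Auth.

    Hypothetical helper for illustration; not part of PySpark or this demo.
    """
    # Basic Auth is "Basic " + base64("username:password").
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {
        # Binary protobuf payloads framed as gRPC-web.
        "Content-Type": "application/grpc-web+proto",
        # Marker header sent by common gRPC-web clients (assumption here).
        "X-Grpc-Web": "1",
        "Authorization": f"Basic {token}",
    }


if __name__ == "__main__":
    # Placeholder credentials, not the demo's real ones.
    for name, value in build_grpc_web_headers("demo", "secret").items():
        print(f"{name}: {value}")
```

Basic Auth only encodes the credentials, it does not encrypt them, which is one reason the proxy also terminates HTTPS.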

Click here to launch the JupyterLite environment.

Architecture diagram

Limitations

See Spark Connect limitations.

About

A demo for using PySpark in a web client using Spark Connect over gRPC-web
