Macaw: lightweight monitoring over UDP

[[very_work_in_progress]]

Macaw is a lightweight runtime profiler for long-running research jobs. Take this toy example:

import macaw
import numpy as np

def very_important_research(n_iterations=200):
    # track() wraps the iterable and reports progress for each iteration
    for i in macaw.track(range(n_iterations), total=n_iterations, desc="doing nothing"):
        # each stage() block is timed and reported under its name
        with macaw.stage("generate_data"):
            data = np.random.randn(65536) * i
            macaw.log.info(f"generated {len(data)} numbers")

        with macaw.stage("fourier_transform"):
            ft = np.fft.fft(data)
            ft = np.fft.fftshift(ft)
            macaw.log.info("transformed data into different data")

    macaw.log.info("done")

if __name__ == "__main__":
    very_important_research()

The output may then look like:

[macaw] job started: very_important_research.py (pid: 48291, job_id: a3f2-...)

iter   1/200  [          ]   0%  eta: --:--
  → generate_data        0.001s
  → fourier_transform    0.043s
      → numpy.fft.fft       0.038s  shape=(65536,) dtype=complex128
      → numpy.fft.fftshift  0.004s  shape=(65536,)
  → crop                 0.001s
  → analysis             0.002s
  metrics: mean=2.31  is_big=False  confidence=0.71  vibes=bad

...

[INFO] done

[macaw] job finished: very_important_research.py
  elapsed:     14s
  exit code:   0
  peak memory: 200.0 MB
  iterations:  200/200

  stage summary:
  fourier_transform    avg: 0.043s  min: 0.038s  max: 0.089s  ± 0.004s
    numpy.fft.fft      avg: 0.038s  min: 0.034s  max: 0.081s  ± 0.003s
    numpy.fft.fftshift avg: 0.004s  min: 0.003s  max: 0.006s  ± 0.001s
  generate_data        avg: 0.001s  min: 0.001s  max: 0.002s  ± 0.000s
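For reference, a summary row like the ones above can be computed from the raw per-iteration durations. This is only a sketch, not Macaw's actual code: `summarize` and `format_row` are hypothetical names, and it assumes the `±` column is a population standard deviation, which the output above does not state.

```python
import statistics


def summarize(durations):
    """Return avg/min/max/spread for a list of stage durations in seconds."""
    return {
        "avg": statistics.fmean(durations),
        "min": min(durations),
        "max": max(durations),
        # Assumption: "±" means the population standard deviation.
        "std": statistics.pstdev(durations),
    }


def format_row(name, durations):
    """Format one stage as a single summary line."""
    s = summarize(durations)
    return (
        f"{name:<20} avg: {s['avg']:.3f}s  min: {s['min']:.3f}s  "
        f"max: {s['max']:.3f}s  ± {s['std']:.3f}s"
    )


# Hypothetical durations for three iterations of one stage.
row = format_row("generate_data", [0.001, 0.001, 0.002])
print(row)
```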

These events are sent over UDP to a receiver, which forwards them to a frontend using server-sent events (SSE). The frontend can then display this information in a dashboard.
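To make the UDP-to-SSE hop concrete, here is a rough sketch of turning one received datagram into one SSE frame. The real receiver lives in /server and is Rust-based, so this Python version only illustrates the wire format. The JSON payload, the `macaw` event name, and the `to_sse_frame` helper are all assumptions made for the example.

```python
import json


def to_sse_frame(datagram: bytes, event: str = "macaw") -> str:
    """Convert one UDP datagram (assumed to be UTF-8 JSON) into an SSE frame."""
    payload = json.loads(datagram.decode("utf-8"))
    # Compact JSON never contains a raw newline, so it fits on one data: line.
    data = json.dumps(payload, separators=(",", ":"))
    # An SSE frame is a set of "field: value" lines ended by a blank line.
    return f"event: {event}\ndata: {data}\n\n"


# Hypothetical event payload, as it might arrive from the Python SDK.
frame = to_sse_frame(b'{"stage": "numpy.fft.fft", "seconds": 0.038}')
print(frame, end="")
```

On the browser side, an `EventSource` listening for the `macaw` event type would receive the JSON string as the event's data.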

Macaw times user-defined stages and substages, and can also wrap common library functions so that they show up as substages. Instead of verbose perf-style timing, you get a high-level profile of only the user-defined stages and the functions you care about.

UDP was chosen because a single dropped event doesn't matter: another progress update will be sent soon anyway, so the overhead of TCP doesn't seem worth it.
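The fire-and-forget idea can be sketched in a few lines of standard-library Python. The event fields and the `send_event` helper are made up for illustration and are not Macaw's wire format. The receiver here is just a loopback socket standing in for the real server.

```python
import json
import socket


def send_event(sock, addr, event):
    """Send one event as a JSON datagram; never wait for an acknowledgement."""
    try:
        sock.sendto(json.dumps(event).encode("utf-8"), addr)
    except OSError:
        # A lost or failed update is fine: the next one replaces it.
        pass


# Stand-in receiver on an OS-assigned loopback port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(1.0)

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_event(sender, receiver.getsockname(), {"iter": 1, "total": 200})

data, _ = receiver.recvfrom(65535)
received = json.loads(data)
print(received)

sender.close()
receiver.close()
```

There is no connection setup, retry, or backpressure, so a slow or absent receiver does not slow down the job being monitored.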

A more concrete use case might be:

with macaw.stage("generate_data"):
    for i in macaw.track(range(1_000_000), total=1_000_000, desc="generating data"):
        generate_data(...)

with macaw.stage("consume_data"):
    ...

This makes it possible to monitor data generation without having to SSH into the box running the script, dig manually for a log file, or do something else annoying.

Structure

/server - Rust-based UDP backend
/frontend - React-based frontend
/sdk/py - Python library
/sdk/jl - Julia library
