Programming with Pixels (PwP) - Package Documentation

This document provides detailed information about the PwP package structure, architecture, and usage patterns.

Architecture Overview

PwP is built around a core Docker-based environment system that allows AI agents to interact with visual interfaces through screenshots and input commands. The system follows a modular design with the following key components:

pwp/
├── env/         # Environment module for Docker-based visual interfaces
├── bench/       # Benchmark module for evaluating agents
├── agents/      # Agent implementations
├── utils/       # Utility functions and helpers
├── tools/       # Tools for agent interaction
├── functions/   # Function implementations for tools
├── prompts/     # Prompt templates for agents
└── docker/      # Docker configuration files for environments

Module Details

`pwp.env` - Environment Module

The core environment module provides the PwP class, which manages Docker containers for visual interface interaction.

Key Features

Docker container management (creation, starting, stopping)
Screenshot capture and rendering
Command execution within the container
File manipulation within the container
VNC support for remote viewing
Checkpointing system for environment state preservation

Example Usage

from pwp.env import PwP

# Create a basic environment
env = PwP(image_name='pwp_env')

# Execute a command
result = env.step("ls -la")
print(result['output'])

# Take a screenshot
screenshot = env.render()
screenshot.save('current_state.png')

# Get DOM structure with bounding boxes
annotated_img, dom_data = env.get_som_image(screenshot)
annotated_img.save('annotated_screenshot.png')

# Get currently visible file in editor
file_view = env.get_file_view()
print(f"Current file: {file_view['filePath']}")
print(f"Cursor position: {file_view['cursorPosition']}")

# Create a checkpoint
env.add_checkpoint("my_checkpoint")

# Restore from checkpoint
env.restore_checkpoint("my_checkpoint")

# Clean up
env.stop()
env.remove()
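
The example above calls stop() and remove() explicitly, so both are skipped if an earlier call raises. The following is a minimal sketch of one way to guarantee cleanup and to roll back a failed command. It uses only the methods shown above (step, add_checkpoint, restore_checkpoint, stop, remove). The helper names managed_env and run_with_rollback and the _FakeEnv stand-in are hypothetical and not part of PwP. It assumes step returns a dict with an 'output' key, as in the example, and that a checkpoint name may be reused.

```python
from contextlib import contextmanager


@contextmanager
def managed_env(env):
    """Yield env, then always stop and remove it (hypothetical helper)."""
    try:
        yield env
    finally:
        env.stop()
        env.remove()


def run_with_rollback(env, command, is_ok, checkpoint="before_command"):
    """Run command; restore the checkpoint if is_ok(output) is false."""
    env.add_checkpoint(checkpoint)
    result = env.step(command)
    if not is_ok(result["output"]):
        env.restore_checkpoint(checkpoint)
        return None
    return result


class _FakeEnv:
    """Stand-in that records calls so the sketch runs without Docker."""

    def __init__(self):
        self.calls = []

    def step(self, command):
        self.calls.append(("step", command))
        return {"output": "error" if "bad" in command else "ok"}

    def add_checkpoint(self, name):
        self.calls.append(("add_checkpoint", name))

    def restore_checkpoint(self, name):
        self.calls.append(("restore_checkpoint", name))

    def stop(self):
        self.calls.append(("stop",))

    def remove(self):
        self.calls.append(("remove",))


fake = _FakeEnv()
with managed_env(fake) as env:
    good = run_with_rollback(env, "ls -la", is_ok=lambda out: out == "ok")
    bad = run_with_rollback(env, "bad-command", is_ok=lambda out: out == "ok")
```

With a real environment, the _FakeEnv() instance would be replaced by PwP(image_name='pwp_env').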

`pwp.bench` - Benchmark Module

The benchmark module provides the PwPBench class, which manages benchmark tasks and evaluation.

Supported Benchmarks

PwP supports the following benchmark tasks:

humaneval: Python coding problems
swebench: Software engineering benchmark
swebench-java: Java software engineering benchmark
dsbench: Data science benchmark
chartmimic: Chart recreation tasks
intercode: Interactive coding in bash, SQL, CTF
design2code: Converting design mockups to code
canitedit: Code editing tasks
resq: Reasoning about SQL queries
minictx: Minimal context understanding
bird: BI reporting dashboard tasks
vscode: VSCode-specific tasks
nocode: No-code tool interaction
swebench-mm: Multimodal software engineering

Example Usage

from pwp.bench import PwPBench

# Create a benchmark instance
bench = PwPBench('humaneval')

# Get the dataset
dataset = bench.get_dataset()
print(f"Loaded {len(dataset)} tasks")

# Create an environment for a specific task
env = bench.get_env(dataset[0])

# Evaluate a solution
reward = bench.get_reward(env, dataset[0])
print(f"Task reward: {reward}")
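
To score an agent on more than one task, the calls above can be wrapped in a loop. The following is a rough sketch, not the project's own evaluation harness. The evaluate function, the solve callback, and the _FakeBench and _FakeEnv stand-ins are hypothetical. It assumes that get_reward returns a value convertible to float, that the dataset supports len() and integer indexing as shown above, and that the environment returned by get_env exposes stop() and remove().

```python
def evaluate(bench, solve, limit=None):
    """Return the mean reward over the first `limit` tasks (all if None)."""
    dataset = bench.get_dataset()
    n = len(dataset) if limit is None else min(limit, len(dataset))
    rewards = []
    for i in range(n):
        task = dataset[i]
        env = bench.get_env(task)
        try:
            solve(env, task)  # hypothetical callback: the agent acts here
            rewards.append(float(bench.get_reward(env, task)))
        finally:
            # Assumption: benchmark environments are cleaned up like PwP ones.
            env.stop()
            env.remove()
    return sum(rewards) / len(rewards) if rewards else 0.0


class _FakeEnv:
    """Stand-in environment so the sketch runs without Docker."""

    def __init__(self):
        self.commands = []
        self.removed = False

    def step(self, command):
        self.commands.append(command)
        return {"output": ""}

    def stop(self):
        pass

    def remove(self):
        self.removed = True


class _FakeBench:
    """Stand-in benchmark with four tasks, three of them solvable."""

    def __init__(self):
        self.envs = []

    def get_dataset(self):
        return [{"solvable": s} for s in (True, False, True, True)]

    def get_env(self, task):
        env = _FakeEnv()
        self.envs.append(env)
        return env

    def get_reward(self, env, task):
        return 1.0 if task["solvable"] and env.commands else 0.0


def solve(env, task):
    env.step("echo attempt")


demo_bench = _FakeBench()
mean_all = evaluate(demo_bench, solve)
mean_two = evaluate(demo_bench, solve, limit=2)
```

With the real package, demo_bench would be replaced by PwPBench('humaneval') and solve by a function that drives an agent.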

`pwp.agents` - Agent Module

The agents module provides implementations of different agent architectures for interacting with visual interfaces.

Available Agents

AssistedAgent: Agent that receives assistance from a human
ComputerUseAgent: Agent that interacts directly with the computer

`pwp.utils` - Utilities Module

The utilities module provides helper functions in the following areas:

Image processing and manipulation
DOM element parsing and visualization
LLM utilities for embedding and encoding
Caching utilities

Example Usage

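from pwp.utils.utils import draw_bounding_boxes

# Assumes `screenshot` (an image, e.g. from env.render()) and
# `dom_data_csv` (DOM data for that screenshot) are already defined.

# Draw bounding boxes on an image based on DOM data
annotated_img = draw_bounding_boxes(
    dom_data_csv,
    screenshot,
    viewport_size={'height': 1080, 'width': 1920},
    caption_icons=True
)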

`pwp.tools` - Tools Module

The tools module provides implementations of tools that agents can use to interact with environments.

Available Tool Categories

Computer interaction tools (mouse, keyboard)
File system tools (read, write, search)
UI analysis tools (element identification)
DOM manipulation tools

`pwp.functions` - Functions Module

The functions module provides implementations of functions that are called by tools.

`pwp.prompts` - Prompts Module

The prompts module provides templates for different agent types.

`pwp.docker` - Docker Configuration

The docker module contains the environment Dockerfile, along with other configuration files and setup scripts for building Docker environments.

Advanced Usage Patterns

Custom Environment Creation

You can create custom environments by extending the base Docker image:

FROM pwp_env

# Install additional packages
RUN apt-get update && apt-get install -y \
    your-package \
    && rm -rf /var/lib/apt/lists/*

# Add application files
COPY your-app /home/devuser/your-app

# Set up application
RUN cd /home/devuser/your-app && \
    npm install
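
Once the Dockerfile is written, the image has to be built and its tag passed to PwP as image_name. The following is a small sketch of that step using the standard docker build command line. The helper names and the my_pwp_env tag are hypothetical, and the build is only assembled here, not executed, so the snippet runs without Docker installed.

```python
import subprocess


def docker_build_command(tag, context_dir=".", dockerfile=None):
    """Assemble the argv for `docker build -t TAG [-f FILE] CONTEXT`."""
    cmd = ["docker", "build", "-t", tag]
    if dockerfile is not None:
        cmd += ["-f", dockerfile]
    cmd.append(context_dir)
    return cmd


def build_image(tag, context_dir=".", dockerfile=None):
    """Run the build; raises CalledProcessError if docker exits non-zero."""
    subprocess.run(docker_build_command(tag, context_dir, dockerfile), check=True)


# The build context must contain everything the Dockerfile copies (your-app).
command = docker_build_command("my_pwp_env", context_dir=".")
```

After build_image("my_pwp_env") succeeds, the custom environment can be created with PwP(image_name='my_pwp_env').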

Adding New Benchmark Tasks

To add a new benchmark task, please follow the detailed instructions in our Contributing Guidelines.

In brief:

Create a directory in pwp_bench/ with your task name
Add a data.jsonl or data.json file with task examples
Create a setup_files directory with setup.py and eval.py
Add your task to task_configs in pwp.bench.benchmark
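
The first three steps can be sketched as follows. This is an illustration only: the write_task_file helper, the my_task name, and the task_id and prompt fields are hypothetical, and the actual schema is defined in the Contributing Guidelines. The sketch writes into a temporary directory and leaves setup.py and eval.py for you to author.

```python
import json
import tempfile
from pathlib import Path


def write_task_file(task_dir, examples):
    """Create task_dir with a setup_files/ folder and a data.jsonl file."""
    task_dir = Path(task_dir)
    (task_dir / "setup_files").mkdir(parents=True, exist_ok=True)
    data_path = task_dir / "data.jsonl"
    with data_path.open("w", encoding="utf-8") as fh:
        for example in examples:
            # JSON Lines: one JSON object per line.
            fh.write(json.dumps(example) + "\n")
    return data_path


# Hypothetical field names; use the ones your setup.py and eval.py expect.
examples = [
    {"task_id": "my_task/0", "prompt": "Rename the variable x to total."},
    {"task_id": "my_task/1", "prompt": "Add a docstring to main()."},
]

with tempfile.TemporaryDirectory() as root:
    path = write_task_file(Path(root) / "pwp_bench" / "my_task", examples)
    lines = path.read_text(encoding="utf-8").splitlines()
    loaded = [json.loads(line) for line in lines]
    has_setup_dir = (path.parent / "setup_files").is_dir()
```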

Agent Development

To create a new agent:

Create a new file in pwp.agents
Implement the agent interface
Add any necessary prompts to pwp.prompts
Add any custom tools to pwp.tools
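
The agent interface itself is not described in this document, so the following is only a sketch of the observe-act loop an agent would sit in, built on the env.render() and env.step() calls from the pwp.env example. The ScriptedAgent class, its act method, the run_episode function, and the _FakeEnv stand-in are all hypothetical. Check the existing classes in pwp.agents for the actual interface.

```python
class ScriptedAgent:
    """Hypothetical agent that replays a fixed list of commands."""

    def __init__(self, commands):
        self._pending = list(commands)

    def act(self, screenshot):
        """Return the next command, or None when the agent is done."""
        # A real agent would inspect the screenshot; this one ignores it.
        return self._pending.pop(0) if self._pending else None


def run_episode(agent, env, max_steps=10):
    """Alternate render and step until the agent stops or max_steps is hit."""
    history = []
    for _ in range(max_steps):
        screenshot = env.render()
        command = agent.act(screenshot)
        if command is None:
            break
        result = env.step(command)
        history.append((command, result["output"]))
    return history


class _FakeEnv:
    """Stand-in environment so the sketch runs without Docker."""

    def render(self):
        return "<screenshot placeholder>"

    def step(self, command):
        return {"output": f"ran {command}"}


full_history = run_episode(ScriptedAgent(["ls", "pwd"]), _FakeEnv())
short_history = run_episode(ScriptedAgent(["ls", "pwd"]), _FakeEnv(), max_steps=1)
```

The max_steps bound keeps an agent that never returns None from looping indefinitely.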
