Refactor: Implement batch import processing across all import tools #16

@ewanc26

Proposal: Transition to Batch Record Processing

To resolve the rate-limiting issues identified in #13, all data ingestion tools in this monorepo should be refactored to use the com.atproto.repo.applyWrites endpoint instead of sequential createRecord calls.

Architectural Plan:

  1. Batch Aggregation: Implement a buffering layer that groups records into batches of 50-100 operations.
  2. Atomic Transactions: Use applyWrites to commit each batch in a single XRPC call, reducing network round-trips and PDS overhead.
  3. Dynamic Rate-Limit Awareness: Read the ratelimit-remaining response header after each batch and throttle ingestion before the limit is exhausted, so the user is not locked out of the service.
  4. MST Optimization: Because each applyWrites call produces one commit, the PDS recomputes the Merkle Search Tree (MST) root once per batch rather than once per record.
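
A minimal sketch of steps 1-3, for illustration only and not taken from the malachite codebase. The helper names (chunk, build_apply_writes, pause_seconds) are hypothetical. It assumes that ratelimit-reset carries a Unix timestamp in seconds and that each create costs a fixed number of rate-limit points (cost_per_write, defaulting to 3 here as an assumption). Sending the request body to the PDS is left out.

```python
from typing import Any, Dict, Iterator, List, Optional
import time

MAX_BATCH = 100  # upper end of the 50-100 range proposed above


def chunk(records: List[Dict[str, Any]], size: int = MAX_BATCH) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` records."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(records), size):
        yield records[start:start + size]


def build_apply_writes(repo: str, collection: str, batch: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build the JSON body for one com.atproto.repo.applyWrites call (creates only)."""
    return {
        "repo": repo,
        "writes": [
            {
                "$type": "com.atproto.repo.applyWrites#create",
                "collection": collection,
                "value": record,
            }
            for record in batch
        ],
    }


def pause_seconds(
    headers: Dict[str, str],
    next_batch_size: int,
    cost_per_write: int = 3,  # assumption: points charged per create
    now: Optional[float] = None,
) -> float:
    """Return how long to wait before the next batch, given lower-cased response headers."""
    remaining = headers.get("ratelimit-remaining")
    reset = headers.get("ratelimit-reset")
    if remaining is None or reset is None:
        return 0.0  # no rate-limit information, so do not throttle
    if int(remaining) >= next_batch_size * cost_per_write:
        return 0.0  # enough budget left for the next batch
    current = time.time() if now is None else now
    # Assumption: ratelimit-reset is a Unix timestamp in seconds.
    return max(0.0, float(reset) - current)
```

An import loop would call chunk over the pending records, POST each build_apply_writes body, then sleep for pause_seconds(response_headers, len(next_batch)) before continuing.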

This refactor will first target the core malachite ingestion engine before expanding to other croft.click utilities.

Associated Pull Requests:

  • Branch: refactor/batch-import-optimization

References:

  • Bitcoin Taproot Scaling Analysis (signature aggregation as analogy for batch optimization)
  • AT Protocol: com.atproto.repo.applyWrites specification
