Blog · Engineering

Latency-Neutral Batching: Processing Bursts Without Penalizing The First Event

December 21, 2025 · 6 min read

A pattern for collapsing webhook bursts while preserving immediate response for human-initiated events.

In distributed systems, there is a persistent tension between throughput (batch things for efficiency) and latency (respond immediately for humans). Most batching patterns sacrifice the latter for the former. This article describes a pattern that doesn’t.

We call it Latency-Neutral Batching: a design where burst traffic collapses into fewer operations, but the first event in any burst experiences zero artificial delay.

I. The Problem: Two Traffic Patterns, One System

Webhook-driven systems receive traffic in two distinct modes:

| Mode | Example | Expectation |
| --- | --- | --- |
| Single events | User saves an invoice at POS | Immediate processing |
| Bursts | CSV import of 50 invoices | Efficient batching |

The constraint: a human at a terminal cannot wait 500ms for a debounce timer. The optimization: 50 webhooks should not spawn 50 jobs.

Standard batching patterns force a choice:

  • Debounce: Wait N ms after the last event → delays everything
  • Fixed window: Collect for N ms, then process → delays the first event
  • Queue depth: Batch if queue > 1 → fails when consumers are fast

None of these preserve the invariant: first event, zero delay.
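To make the first failure concrete, here is a minimal sketch of a trailing debounce using a simulated clock rather than real timers (the event timings are illustrative):

```javascript
// Trailing debounce: every new event pushes the fire time out, so even a
// lone event waits the full debounce interval before processing starts.
const DEBOUNCE_MS = 500;

function simulateDebounce(eventTimes) {
  // Returns the time at which the batch finally fires.
  let fireAt = -Infinity;
  for (const t of eventTimes) {
    fireAt = t + DEBOUNCE_MS; // each event resets the deadline
  }
  return fireAt;
}

// A single human-initiated save at t=0 still waits the full 500 ms.
const singleEventFiresAt = simulateDebounce([0]);

// A burst of 50 events, 10 ms apart, fires 500 ms after the LAST event.
const burstEvents = Array.from({ length: 50 }, (_, i) => i * 10);
const burstFiresAt = simulateDebounce(burstEvents);
```

The burst collapses nicely, but the human-facing case pays the full penalty, which is exactly the trade-off this pattern refuses to make.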

II. The Core Principle: Time vs. Execution State

The insight that unlocks the solution:

Execution state tells you what the system is doing now. Time tells you what to expect next.

Queue depth, worker availability, and job counts are execution signals. They reflect current state, not upstream intent. When consumers process faster than events arrive, execution signals collapse to zero between events—even mid-burst.

Time does not collapse. A 1-second window remains 1 second regardless of processing speed. Therefore:

Burst detection must be time-based, not execution-based.
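A small sketch of the contrast, assuming a consumer fast enough to drain each event before the next arrives (the timings are illustrative):

```javascript
// Events 50 ms apart; the consumer finishes each one before the next lands.
const WINDOW_MS = 1000;
const eventTimes = [0, 50, 100, 150];

// Execution signal: queue depth observed as each event arrives. With a
// fast consumer the queue has always drained, so depth reads 0 mid-burst.
const depthOnArrival = eventTimes.map(() => 0);

// Time signal: was the previous event within the last WINDOW_MS? The
// window cannot "drain", so every event after the first is seen as burst.
const inBurstOnArrival = eventTimes.map(
  (t, i) => i > 0 && t - eventTimes[i - 1] < WINDOW_MS
);
```

The depth check misclassifies the entire burst as idle traffic; the time check does not.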

III. The Architecture: Temporal Lease + Single-Flight Runner

A. The Temporal Lease

On each webhook, set a short-lived Redis key:

SET sync:dirty:{tenant} "1" PX 1200

This key answers one question: Has this tenant been active in the last 1.2 seconds?

Properties:

  • Refreshed on each incoming event
  • Independent of job execution
  • Decays naturally when traffic stops
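A deterministic in-memory stand-in for the lease (production uses the Redis key above; the `Map` and explicit clock here are for illustration only):

```javascript
// Stand-in for `SET sync:dirty:{tenant} "1" PX 1200`: a Map of absolute
// expiry times plus a caller-supplied clock, so the sketch is deterministic.
const LEASE_MS = 1200;
const leaseExpiry = new Map(); // tenantId -> expiry timestamp (ms)

function touchLease(tenantId, now) {
  // Refreshed on every incoming event, independent of job execution.
  leaseExpiry.set(tenantId, now + LEASE_MS);
}

function leaseAlive(tenantId, now) {
  // Answers: has this tenant been active in the last 1.2 seconds?
  const expiry = leaseExpiry.get(tenantId);
  return expiry !== undefined && now < expiry;
}
```

Note that `leaseAlive` never extends the lease: only ingress traffic refreshes it, which is what lets it decay naturally when the burst ends.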

B. The Single-Flight Runner

One job per tenant, enforced via idempotent job ID:

queue.add('tenant-sync-runner', payload, {
  jobId: `sync-runner_${tenantId}`
});

Properties:

  • At most one runner active per tenant
  • Subsequent enqueues are no-ops
  • Runner owns exclusive execution for the tenant
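The no-op behavior can be sketched with a plain `Map` standing in for the queue's deduplication (BullMQ's `jobId` option provides this in the real system):

```javascript
// Single-flight enqueue: a second add with the same jobId is a no-op,
// mirroring the queue's idempotent jobId deduplication.
const activeJobs = new Map(); // jobId -> payload

function enqueueRunner(tenantId, payload) {
  const jobId = `sync-runner_${tenantId}`;
  if (activeJobs.has(jobId)) return false; // runner already active: no-op
  activeJobs.set(jobId, payload);
  return true; // this call actually started a runner
}

const first = enqueueRunner("t1", {});  // starts the runner
const second = enqueueRunner("t1", {}); // no-op: runner already active
```

The webhook handler never needs to know whether a runner exists; it enqueues unconditionally and lets deduplication sort it out.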

C. The Execution Loop

The runner operates in an iterative loop:

1. Fetch all work since cursor
2. Process the batch
3. Check if temporal lease still exists
   → Yes: Sleep 200ms, return to step 1
   → No:  Traffic stopped, finalize and exit
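The loop above can be sketched as follows, with a simulated clock and stand-in batches in place of the real fetch/process steps (all names here are illustrative):

```javascript
// Iterative sweep: process batches while the temporal lease is alive,
// then finalize and exit. A fixed lease expiry keeps the sketch simple;
// in production, incoming events keep pushing the expiry forward.
const SLEEP_MS = 200;

function runSweep({ leaseExpiresAt, batches }) {
  let now = 0;
  let cursor = 0;
  const processed = [];
  while (true) {
    // 1. Fetch all work since the cursor (stand-in: next queued batch).
    const batch = batches[cursor] ?? [];
    cursor += 1;
    // 2. Process the batch.
    processed.push(...batch);
    // 3. Lease still alive? Linger and loop; otherwise finalize.
    if (now >= leaseExpiresAt) break;
    now += SLEEP_MS; // sleep 200 ms, then return to step 1
  }
  return processed;
}

const result = runSweep({
  leaseExpiresAt: 400, // traffic stopped shortly after the first event
  batches: [[1, 2], [3], [4]],
});
```

Because the exit check happens after processing, work that arrives during the final lease window is still swept before the runner exits.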

IV. Why This Preserves Zero Latency

First event in a burst:

  1. Sets dirty signal
  2. Enqueues runner (no existing job)
  3. Runner starts immediately
  4. Processing begins with no artificial delay

Subsequent events:

  1. Refresh dirty signal (extend TTL)
  2. Runner enqueue is no-op (job already exists)
  3. Runner’s loop absorbs new work in next iteration

The first event never waits. Batching emerges on the tail, not the head.
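The two paths reduce to a tiny webhook handler. This end-to-end sketch uses an in-memory stand-in for both the lease and the enqueue (the counters exist only to make the behavior observable):

```javascript
// Every webhook refreshes the lease and attempts an enqueue; only the
// first event in a burst actually starts a runner, and it starts at once.
const LEASE_MS = 1200;
const state = { leaseExpiry: 0, runnerActive: true && false, runnerStarts: 0 };

function onWebhook(now) {
  state.leaseExpiry = now + LEASE_MS; // refresh the temporal lease
  if (!state.runnerActive) {          // idempotent enqueue
    state.runnerActive = true;
    state.runnerStarts += 1;          // runner begins with zero delay
  }
}

// A burst of 50 webhooks, 10 ms apart: one runner, started at t=0.
for (let i = 0; i < 50; i++) onWebhook(i * 10);
```

Fifty events, one runner, and the lease ends up extended 1200 ms past the last event, which is exactly how long the runner will linger.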

V. Contrast With Common Patterns

| Pattern | Mechanism | First Event Delay |
| --- | --- | --- |
| Debounce | Wait N ms after last event | +N ms |
| Fixed window | Collect for N ms | +0 to +N ms |
| Queue depth | Check if queue > 1 | +0 ms (but fails under fast drain) |
| Latency-Neutral | Start immediately, linger | +0 ms |

VI. The Generalized Pattern

This approach applies wherever you have:

  • Bursty ingress: Webhooks, polling, event streams
  • Mixed latency requirements: Some human-initiated, some bulk
  • Tenant isolation: Work must not cross boundaries
  • Acceptable eventual consistency: Final state matters, not per-event confirmation

The core abstractions:

| Abstraction | Responsibility |
| --- | --- |
| Temporal Lease | Time-bounded burst detection |
| Single-Flight Runner | At most one executor per tenant |
| Iterative Sweep | Loop until temporal lease expires |

VII. Invariants

These properties must hold:

  1. At most one runner per tenant — enforced by idempotent job ID
  2. First event is never delayed — runner starts immediately
  3. No event is dropped — cursor-based sweep catches all changes
  4. Convergence is deterministic — runner exits when lease expires
  5. Ingestion never blocks — webhook handler is async

VIII. When To Use This Pattern

Use latency-neutral batching when:

  • User-facing latency matters (POS, invoicing, payments)
  • Upstream sends unbounded bursts (imports, reconnects, replays)
  • Downstream has rate limits or per-call overhead
  • Simple debouncing would degrade UX

Do not use when:

  • All events can tolerate uniform delay
  • Strict ordering per-event is required
  • The system is purely background/async

By the TaxBridge Engineering Team