A pattern for collapsing webhook bursts while preserving immediate response for human-initiated events.
In distributed systems, there is a persistent tension between throughput (batch things for efficiency) and latency (respond immediately for humans). Most batching patterns sacrifice the latter for the former. This article describes a pattern that doesn’t.
We call it Latency-Neutral Batching: a design where burst traffic collapses into fewer operations, but the first event in any burst experiences zero artificial delay.
I. The Problem: Two Traffic Patterns, One System
Webhook-driven systems receive traffic in two distinct modes:
| Mode | Example | Expectation |
|---|---|---|
| Single events | User saves an invoice at POS | Immediate processing |
| Bursts | CSV import of 50 invoices | Efficient batching |
The constraint: a human at a terminal cannot wait 500ms for a debounce timer. The optimization: 50 webhooks should not spawn 50 jobs.
Standard batching patterns force a choice:
- Debounce: Wait N ms after the last event → delays everything
- Fixed window: Collect for N ms, then process → delays the first event
- Queue depth: Batch if queue > 1 → fails when consumers are fast
None of these preserve the invariant: first event, zero delay.
II. The Core Principle: Time vs. Execution State
The insight that unlocks the solution:
Execution state tells you what the system is doing now. Time tells you what to expect next.
Queue depth, worker availability, and job counts are execution signals. They reflect current state, not upstream intent. When consumers process faster than events arrive, execution signals collapse to zero between events—even mid-burst.
Time does not collapse. A 1-second window remains 1 second regardless of processing speed. Therefore:
Burst detection must be time-based, not execution-based.
III. The Architecture: Temporal Lease + Single-Flight Runner
A. The Temporal Lease
On each webhook, set a short-lived Redis key:
SET sync:dirty:{tenant} "1" PX 1200
This key answers one question: Has this tenant been active in the last 1.2 seconds?
Properties:
- Refreshed on each incoming event
- Independent of job execution
- Decays naturally when traffic stops
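From application code, the lease is one call per event. A minimal sketch, assuming ioredis; the client and the `refreshLease` helper are illustrative names, not part of the pattern itself:

```ts
import Redis from "ioredis";

const redis = new Redis();
const LEASE_TTL_MS = 1200; // matches PX 1200 above

// Set or refresh the temporal lease for a tenant. Every incoming
// event extends the activity window by the full TTL.
async function refreshLease(tenantId: string): Promise<void> {
  await redis.set(`sync:dirty:${tenantId}`, "1", "PX", LEASE_TTL_MS);
}
```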
B. The Single-Flight Runner
One job per tenant, enforced via idempotent job ID:
// Deterministic jobId: while a runner job exists for this tenant,
// further enqueues are no-ops.
await queue.add('tenant-sync-runner', payload, {
  jobId: `sync-runner_${tenantId}`
});
Properties:
- At most one runner active per tenant
- Subsequent enqueues are no-ops
- Runner owns exclusive execution for the tenant
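Combining A and B, the entire ingress path is two non-blocking calls. A sketch assuming BullMQ (whose `jobId` option provides the idempotency) plus the `refreshLease` helper above; `syncQueue` and `onWebhook` are illustrative names:

```ts
import { Queue } from "bullmq";

const syncQueue = new Queue("tenant-sync");

// Webhook handler: refresh the lease, then try to start the runner.
// If a runner job already exists for this tenant, add() is a no-op.
async function onWebhook(tenantId: string): Promise<void> {
  await refreshLease(tenantId);
  await syncQueue.add(
    "tenant-sync-runner",
    { tenantId },
    { jobId: `sync-runner_${tenantId}` }
  );
}
```

Note that the job data carries only the tenant: the runner sweeps from its cursor, so per-event payloads never need to reach it. One BullMQ-specific caveat: a completed job kept in the completed set still occupies its jobId and would block the next burst from enqueueing a fresh runner, so `removeOnComplete` should be set to keep the single-flight ID reusable.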
C. The Execution Loop
The runner operates in an iterative loop (sketched in code below):
1. Fetch all work since cursor
2. Process the batch
3. Check if temporal lease still exists
→ Yes: Sleep 200ms, return to step 1
→ No: Traffic stopped, finalize and exit
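A sketch of the loop in the same assumed stack; `loadCursor`, `saveCursor`, `fetchChangesSince`, and `processBatch` are hypothetical stand-ins for the system's own persistence and batch logic, and `redis` is the client from above:

```ts
// Hypothetical persistence hooks; replace with real implementations.
declare function loadCursor(tenantId: string): Promise<string>;
declare function saveCursor(tenantId: string, cursor: string): Promise<void>;
declare function fetchChangesSince(
  tenantId: string,
  cursor: string
): Promise<{ changes: unknown[]; nextCursor: string }>;
declare function processBatch(
  tenantId: string,
  changes: unknown[]
): Promise<void>;

// Runner body, executed by the single-flight job for one tenant.
async function runTenantSync(tenantId: string): Promise<void> {
  let cursor = await loadCursor(tenantId);

  while (true) {
    // Steps 1-2: fetch everything since the cursor and process it.
    const { changes, nextCursor } = await fetchChangesSince(tenantId, cursor);
    await processBatch(tenantId, changes);
    cursor = nextCursor;
    await saveCursor(tenantId, cursor);

    // Step 3: lease gone means traffic stopped; finalize and exit.
    if (!(await redis.exists(`sync:dirty:${tenantId}`))) break;

    // Lease still live: linger 200ms, then sweep again.
    await new Promise((resolve) => setTimeout(resolve, 200));
  }
}
```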
IV. Why This Preserves Zero Latency
First event in a burst:
- Sets dirty signal
- Enqueues runner (no existing job)
- Runner starts immediately
- Processing begins with no artificial delay
Subsequent events:
- Refresh dirty signal (extend TTL)
- Runner enqueue is no-op (job already exists)
- Runner’s loop absorbs new work in next iteration
The first event never waits. Batching emerges on the tail, not the head.
V. Contrast With Common Patterns
| Pattern | Mechanism | First Event Delay |
|---|---|---|
| Debounce | Wait N ms after last event | +N ms |
| Fixed window | Collect for N ms | +0 to +N ms |
| Queue depth | Check if queue > 1 | +0 ms (but fails under fast drain) |
| Latency-Neutral | Start immediately, linger | +0 ms |
VI. The Generalized Pattern
This approach applies wherever you have:
- Bursty ingress: Webhooks, polling, event streams
- Mixed latency requirements: Some human-initiated, some bulk
- Tenant isolation: Work must not cross boundaries
- Acceptable eventual consistency: Final state matters, not per-event confirmation
The core abstractions:
| Abstraction | Responsibility |
|---|---|
| Temporal Lease | Time-bounded burst detection |
| Single-Flight Runner | At most one executor per tenant |
| Iterative Sweep | Loop until temporal lease expires |
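Stripped of Redis and queue specifics, the abstractions reduce to three small interfaces. A hypothetical TypeScript sketch:

```ts
// Time-bounded burst detection: "has this scope been active lately?"
interface TemporalLease {
  refresh(scope: string): Promise<void>;    // extend the activity window
  isAlive(scope: string): Promise<boolean>; // false once traffic stops
}

// At most one executor per scope; duplicate starts are no-ops.
interface SingleFlightRunner {
  ensureRunning(scope: string): Promise<void>;
}

// Sweeps work from a cursor until the lease expires.
interface IterativeSweep {
  run(scope: string, lease: TemporalLease): Promise<void>;
}
```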
VII. Invariants
These properties must hold:
- At most one runner per tenant — enforced by idempotent job ID
- First event is never delayed — runner starts immediately
- No event is dropped — cursor-based sweep catches all changes
- Convergence is deterministic — runner exits when lease expires
- Ingestion never blocks — the webhook handler only refreshes the lease and enqueues, then returns
VIII. When To Use This Pattern
Use latency-neutral batching when:
- User-facing latency matters (POS, invoicing, payments)
- Upstream sends unbounded bursts (imports, reconnects, replays)
- Downstream has rate limits or per-call overhead
- Simple debouncing would degrade UX
Do not use when:
- All events can tolerate uniform delay
- Strict ordering per-event is required
- The system is purely background/async
By the TaxBridge Engineering Team