How Webhooks Work: A Technical Deep Dive

Published Feb 21, 2026 · 12 min read
[Diagram: how webhooks work, showing HTTP POST requests and event-driven architecture]

If you have read the complete guide to webhooks, you know what webhooks are and why they matter. This guide goes deeper. We will examine the internal architecture of webhook systems — how events are captured, payloads are constructed, deliveries are dispatched, retries are handled, and how the entire pipeline scales to millions of events. Understanding these internals makes you a better webhook consumer and helps you build more reliable integrations.

The HTTP POST Request: Foundation of Every Webhook

At its core, every webhook is an HTTP POST request. Understanding the mechanics of this request is essential to understanding how webhooks work.

The Request Lifecycle

When a webhook provider sends you an event, here is exactly what happens at the network level:

1. DNS Resolution

The provider resolves your webhook URL's domain name to an IP address. This is why your endpoint needs a valid DNS record — and why DNS propagation delays can cause webhook delivery failures after domain changes.

2. TLS Handshake

If your endpoint uses HTTPS (and it should), the provider and your server establish an encrypted connection. The provider verifies your server's TLS certificate, and they negotiate encryption parameters. This typically takes 1-3 round trips.

3. HTTP Request Transmission

The provider sends the HTTP POST request containing the method, path, headers, and body:

POST /webhook/endpoint HTTP/1.1
Host: your-app.com
Content-Type: application/json
Content-Length: 342
X-Webhook-ID: evt_abc123
X-Webhook-Timestamp: 1708523400
X-Webhook-Signature: v1=a1b2c3d4e5f6...
User-Agent: WebhookProvider/2.0

{"event":"payment.completed","data":{"id":"pay_789","amount":4999}}

4. Response

Your server receives the request, processes it (or queues it for processing), and returns an HTTP response. The provider typically expects a 2xx status code within 5-30 seconds.

Webhook Headers in Detail

Headers carry critical metadata about the webhook delivery. Here are the headers you will encounter:

Standard HTTP headers:

  • Content-Type — almost always application/json; occasionally application/xml or application/x-www-form-urlencoded
  • Content-Length — size of the request body in bytes
  • User-Agent — identifies the webhook provider (e.g., Stripe/1.0, GitHub-Hookshot/abc123)

Provider-specific headers:

  • X-Webhook-ID or X-Request-ID — unique identifier for this delivery attempt
  • X-Webhook-Timestamp — when the webhook was sent (Unix timestamp or ISO 8601)
  • X-Webhook-Signature — cryptographic signature for verification
  • X-Webhook-Event — the event type that triggered this delivery

Different providers use different header naming conventions. Stripe uses Stripe-Signature, GitHub uses X-Hub-Signature-256, and Shopify uses X-Shopify-Hmac-Sha256. The Standard Webhooks specification aims to unify these under webhook-id, webhook-timestamp, and webhook-signature.

Event-Driven Architecture Behind Webhooks

Webhook providers do not just fire HTTP requests directly from their application code. Production webhook systems are built on event-driven architectures with multiple layers of reliability.

Event Generation Layer

The first layer captures events as they happen within the provider's system. When a customer makes a payment, the payment processing code emits an event:

// Inside the provider's payment processing code
async function processPayment(paymentIntent) {
  const result = await chargeCard(paymentIntent);

  if (result.success) {
    // Emit the event to the event bus
    await eventBus.emit({
      type: 'payment_intent.succeeded',
      data: paymentIntent,
      timestamp: Date.now(),
      idempotencyKey: `pi_${paymentIntent.id}_succeeded`
    });
  }
}

This event is published to a message broker (Kafka, RabbitMQ, Amazon SQS, or a similar system), not sent directly as a webhook. This decoupling is essential for reliability.

Event Routing Layer

The routing layer determines which webhook subscriptions should receive each event. A single event might need to go to multiple endpoints — for example, if you have subscribed to payment_intent.succeeded on three different webhook endpoints.

Event: payment_intent.succeeded
  ├── Subscription 1: https://app-a.com/webhook  → Queue delivery
  ├── Subscription 2: https://app-b.com/webhook  → Queue delivery
  └── Subscription 3: https://webhookify.app/wh/xyz → Queue delivery

The router filters events based on subscription configuration (event types, endpoint URLs, active/inactive status) and creates individual delivery jobs for each matching subscription.
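A minimal sketch of that matching step, with illustrative field names for the subscription records:

```javascript
// Fan-out: match one event against all subscriptions and produce one
// delivery job per match. The subscription shape is hypothetical.
function routeEvent(event, subscriptions) {
  return subscriptions
    .filter((sub) => sub.active && sub.eventTypes.includes(event.type))
    .map((sub) => ({
      subscriptionId: sub.id,
      endpointUrl: sub.url,
      payload: event,
      attemptNumber: 1
    }));
}
```

Each job returned here is then enqueued independently, so one slow or failing endpoint never delays deliveries to the others.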

Delivery Layer

The delivery layer is where HTTP requests are actually sent. Each delivery job is processed by a worker that:

  1. Constructs the HTTP request (headers, payload, signature)
  2. Sends the request to the subscriber's URL
  3. Waits for a response
  4. Records the result (success or failure)
  5. Schedules a retry if the delivery failed

┌──────────────────────────────────────────────────────────────┐
│                   Event Occurs                                │
│            (e.g., payment succeeds)                           │
└─────────────────────┬────────────────────────────────────────┘
                      │
                      ▼
┌──────────────────────────────────────────────────────────────┐
│              Event Published to Message Queue                 │
│           (Kafka / SQS / RabbitMQ / Redis)                   │
└─────────────────────┬────────────────────────────────────────┘
                      │
                      ▼
┌──────────────────────────────────────────────────────────────┐
│             Event Router / Fan-out                            │
│    Matches event to active webhook subscriptions             │
└─────┬──────────────┬──────────────┬──────────────────────────┘
      │              │              │
      ▼              ▼              ▼
┌──────────┐  ┌──────────┐  ┌──────────┐
│ Delivery │  │ Delivery │  │ Delivery │
│ Worker 1 │  │ Worker 2 │  │ Worker 3 │
└────┬─────┘  └────┬─────┘  └────┬─────┘
     │              │              │
     ▼              ▼              ▼
  HTTP POST      HTTP POST      HTTP POST
  to App A       to App B       to Webhookify

Delivery Guarantees: At-Least-Once vs Exactly-Once

Understanding delivery semantics is critical for building reliable webhook consumers.

At-Least-Once Delivery

Most webhook providers guarantee at-least-once delivery. This means every event will be delivered at least one time, but it might be delivered more than once. Duplicates occur due to:

  • Network timeouts where the provider cannot confirm if you received the request
  • Provider-side retries triggered by ambiguous failures
  • Infrastructure failures causing the delivery system to replay events

At-least-once delivery is the practical standard because it is achievable without complex distributed transaction systems.

Exactly-Once Processing

While exactly-once delivery is theoretically impossible in distributed systems (due to the Two Generals' Problem), you can achieve exactly-once processing on the consumer side through idempotency:

async function handleWebhook(event) {
  // Check if we have already processed this event
  const alreadyProcessed = await redis.get(`webhook:processed:${event.id}`);
  if (alreadyProcessed) {
    return { status: 'duplicate', skipped: true };
  }

  // Process the event
  await processEvent(event);

  // Mark as processed with a TTL (e.g., 7 days)
  await redis.set(`webhook:processed:${event.id}`, '1', 'EX', 604800);

  return { status: 'processed' };
}

At-least-once delivery means your webhook handler WILL receive duplicates. This is not a bug — it is a design choice that prioritizes reliability over uniqueness. Always implement idempotency in your webhook consumers.

Retry Mechanisms in Detail

When a webhook delivery fails, the provider's retry system takes over. Here is how retry mechanisms work internally.

What Triggers a Retry

A delivery is considered failed when:

  • The connection times out (typically 5-30 second timeout)
  • DNS resolution fails
  • TLS handshake fails
  • The server returns a non-2xx status code (4xx or 5xx)
  • The connection is refused or reset

Note: a 410 Gone response typically tells the provider to disable the subscription entirely, not retry.

Exponential Backoff

Most providers use exponential backoff for retries, increasing the delay between attempts:

Attempt 1: Immediate (original delivery)
Attempt 2: 1 minute later
Attempt 3: 5 minutes later
Attempt 4: 30 minutes later
Attempt 5: 2 hours later
Attempt 6: 8 hours later
Attempt 7: 24 hours later

This pattern prevents overwhelming a recovering endpoint with a flood of retries. Some providers add random jitter to the backoff intervals to prevent thundering herd problems when many deliveries fail simultaneously.
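The schedule above, with up to 10% jitter added, might be computed like this. The delay values and the cap are illustrative, not any particular provider's configuration:

```javascript
// Delay (ms) before each attempt: index 0 is the original delivery.
// Values mirror the example schedule above.
const BASE_DELAYS_MS = [0, 60e3, 300e3, 1800e3, 7200e3, 28800e3, 86400e3];

// Returns the delay before the given 1-based attempt number,
// capped at the last configured delay, plus up to 10% random jitter
function nextRetryDelay(attemptNumber) {
  const index = Math.min(attemptNumber - 1, BASE_DELAYS_MS.length - 1);
  const base = BASE_DELAYS_MS[index];
  const jitter = Math.random() * 0.1 * base;
  return Math.round(base + jitter);
}
```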

Retry Budget and Dead Letter Queues

After exhausting all retry attempts (typically over 24-72 hours), the provider must decide what to do with undeliverable events:

  • Dead Letter Queue (DLQ): The event is moved to a separate queue for manual inspection and replay. This is the most robust approach.
  • Drop and Alert: The event is discarded and the webhook subscription owner is notified. Some providers disable the subscription after too many failures.
  • API Recovery: Some providers (like Stripe) allow you to list missed events via API, enabling manual recovery.
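The end-of-budget decision can be sketched as follows. The queue interface, queue names, and the seven-attempt cap are assumptions for illustration:

```javascript
const MAX_ATTEMPTS = 7; // illustrative retry budget

// Called after a failed delivery attempt: either schedule another
// attempt or park the job in a dead letter queue for manual replay
async function handleFailedDelivery(job, queue) {
  if (job.attemptNumber >= MAX_ATTEMPTS) {
    // Retry budget exhausted: dead-letter the job for inspection
    await queue.enqueue('webhook-dlq', { ...job, failedAt: Date.now() });
    return 'dead-lettered';
  }
  await queue.enqueue('webhook-deliveries', {
    ...job,
    attemptNumber: job.attemptNumber + 1
  });
  return 'retry-scheduled';
}
```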

Payload Construction

When a provider constructs a webhook payload, several decisions affect what you receive.

Event Envelope

Most providers wrap event data in an envelope that includes metadata:

{
  "id": "evt_1OqR3x2eZvKYlo2C",
  "type": "payment_intent.succeeded",
  "created": 1708523400,
  "api_version": "2026-01-15",
  "data": {
    "object": {
      "id": "pi_3OqR3x2eZvKYlo2C",
      "amount": 2000,
      "currency": "usd",
      "status": "succeeded"
    }
  }
}

The envelope provides:

  • Event ID — unique identifier for deduplication
  • Event type — what happened, for routing to the correct handler
  • Timestamp — when the event occurred
  • API version — which version of the data schema is being used
  • Data — the actual event payload

Thin vs Fat Payloads

Providers choose between two payload strategies:

Fat payloads include all event data directly in the webhook body. You have everything you need to process the event without making additional API calls:

{
  "event": "order.created",
  "data": {
    "id": "ord_123",
    "customer": { "id": "cus_456", "name": "Jane Doe", "email": "jane@example.com" },
    "items": [{ "name": "Widget", "quantity": 2, "price": 29.99 }],
    "total": 59.98,
    "shipping_address": { "street": "123 Main St", "city": "Springfield" }
  }
}

Thin payloads include only the event type and a reference ID. You must call the API to get the full data:

{
  "event": "order.created",
  "data": {
    "id": "ord_123"
  }
}

Fat payloads are more convenient but can contain stale data if the resource changed between the event firing and your processing. Thin payloads require an extra API call but always give you the current state. See webhook payload formats for more details on parsing different formats.

Queuing Systems Behind the Scenes

Production webhook systems process millions of events daily. The queuing infrastructure makes this possible.

Message Queue Architecture

Modern webhook providers use message queues to decouple event generation from delivery:

Apache Kafka is popular for high-throughput systems. Events are written to topic partitions and consumed by delivery workers. Kafka's log-based architecture provides excellent durability and replay capability.

Amazon SQS and Google Cloud Pub/Sub provide managed queuing with built-in retry and dead letter queue support. Many providers on AWS use SQS for its simplicity and reliability.

Redis Streams or BullMQ are used by smaller-scale systems. Redis provides low-latency queuing with support for consumer groups and acknowledgments.

Worker Pool Design

Delivery workers are typically stateless processes that:

  1. Pull a delivery job from the queue
  2. Build the HTTP request
  3. Send it to the endpoint
  4. Record the result
  5. Acknowledge the job (or re-queue for retry on failure)

// Simplified webhook delivery worker
async function deliveryWorker() {
  while (true) {
    const job = await queue.dequeue('webhook-deliveries');

    const { endpointUrl, payload, secret, attemptNumber } = job;

    // Construct signed request
    const timestamp = Math.floor(Date.now() / 1000);
    const signature = computeHMAC(secret, `${timestamp}.${JSON.stringify(payload)}`);

    try {
      const response = await fetch(endpointUrl, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'X-Webhook-Timestamp': timestamp.toString(),
          'X-Webhook-Signature': `v1=${signature}`,
          'X-Webhook-ID': payload.id
        },
        body: JSON.stringify(payload),
        signal: AbortSignal.timeout(30000) // 30 second timeout
      });

      if (response.ok) {
        await queue.acknowledge(job);
        await recordDelivery(job, 'success', response.status);
      } else {
        await scheduleRetry(job, attemptNumber + 1);
        await recordDelivery(job, 'failed', response.status);
      }
    } catch (error) {
      await scheduleRetry(job, attemptNumber + 1);
      await recordDelivery(job, 'error', error.message);
    }
  }
}

Workers are scaled horizontally — during peak load, more workers are added to process the delivery queue faster. During quiet periods, workers are scaled down to save resources.

When you receive a webhook, respond with HTTP 200 as fast as possible. The delivery worker is waiting for your response, and a slow response ties up the worker, reducing the provider's delivery throughput. Accept the payload, respond immediately, and process asynchronously. See our error handling guide for the queue-then-acknowledge pattern.
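A bare-bones version of that queue-then-acknowledge pattern, using an in-memory array as a stand-in for a durable queue (production code should use Redis, SQS, or similar, since an in-memory queue loses events on restart):

```javascript
// In-memory stand-in for a durable queue
const pending = [];

// Request path: enqueue and return 200 immediately, no I/O
function acknowledgeWebhook(event) {
  pending.push(event);
  return { status: 200 };
}

// Background path: the slow work happens here, off the request path
async function drainQueue(processOne) {
  while (pending.length > 0) {
    await processOne(pending.shift());
  }
}
```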

Ordering and Concurrency

Webhook events are not guaranteed to arrive in order. Here is why:

Out-of-Order Delivery

Consider two events that happen in quick succession:

  1. order.created at timestamp T1
  2. order.updated at timestamp T2 (T2 > T1)

If these events are dispatched by different delivery workers, the order.updated webhook might arrive before order.created due to network latency differences, server processing times, or retry timing.

Handling Out-of-Order Events

There are several strategies for dealing with out-of-order delivery:

Timestamp-based ordering: Store the event timestamp and only process events that are newer than the last processed event for that resource:

async function handleOrderEvent(event) {
  const lastProcessedAt = await db.getLastEventTimestamp(event.data.order_id);

  if (lastProcessedAt && event.created <= lastProcessedAt) {
    // This event is older than one we already processed — skip it
    return;
  }

  await processOrderEvent(event);
  await db.setLastEventTimestamp(event.data.order_id, event.created);
}

State reconciliation: Instead of applying incremental changes, fetch the current state from the API after receiving any event:

async function handleOrderEvent(event) {
  // Ignore the webhook payload details
  // Fetch the current state instead
  const currentOrder = await api.getOrder(event.data.order_id);
  await syncLocalOrder(currentOrder);
}

Event sourcing: Store all events in order and reconstruct state by replaying them. This gives you a complete audit trail and handles out-of-order delivery naturally.

Connection Management and Performance

The way webhook connections are managed affects both delivery speed and reliability.

Connection Pooling

Webhook providers typically maintain connection pools to frequently-contacted endpoints. HTTP keep-alive connections avoid the overhead of repeated TCP handshakes and TLS negotiations, significantly improving delivery throughput.

Timeout Handling

Providers enforce strict timeouts to prevent delivery workers from being tied up by slow endpoints:

  • Connection timeout: 5-10 seconds to establish the TCP connection
  • Response timeout: 10-30 seconds to receive the complete response
  • Total timeout: 30-60 seconds for the entire delivery attempt

If your endpoint consistently approaches these limits, you risk intermittent delivery failures. Always respond quickly and process asynchronously.

Concurrent Deliveries

Providers may send multiple webhooks to your endpoint simultaneously. If you receive a burst of events, you might see 10 or 50 concurrent POST requests hitting your server at once. Ensure your endpoint can handle concurrent requests without race conditions:

// Use database transactions or distributed locks for critical operations
async function handlePaymentWebhook(event) {
  await db.transaction(async (tx) => {
    const order = await tx.getOrderForUpdate(event.data.order_id); // Row-level lock
    if (order.status === 'paid') return; // Already processed
    await tx.updateOrderStatus(order.id, 'paid');
    await tx.createFulfillment(order);
  });
}

Monitoring Webhook Delivery Health

Understanding how webhooks work internally helps you monitor their health effectively. Key metrics to track:

  • Delivery success rate — percentage of webhooks that receive a 2xx response on the first attempt
  • Average response time — how quickly your endpoint responds
  • Retry rate — percentage of deliveries that require retries
  • Event processing latency — time from event occurrence to successful processing
  • Error distribution — breakdown of failure reasons (timeout, 5xx, connection refused)
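If you log each delivery attempt, the first three metrics fall out directly. The record shape here is illustrative:

```javascript
// Derive health metrics from a log of delivery attempts.
// Assumed record shape: { eventId, attempt, status, responseMs }
function deliveryMetrics(records) {
  const firstAttempts = records.filter((r) => r.attempt === 1);
  const firstTrySuccesses = firstAttempts.filter((r) => r.status >= 200 && r.status < 300);
  const retriedEvents = new Set(records.filter((r) => r.attempt > 1).map((r) => r.eventId));
  const avgMs = records.reduce((sum, r) => sum + r.responseMs, 0) / records.length;
  return {
    successRate: firstTrySuccesses.length / firstAttempts.length,
    retryRate: retriedEvents.size / firstAttempts.length,
    avgResponseMs: avgMs
  };
}
```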

Webhookify provides comprehensive delivery monitoring out of the box. Every webhook hitting your Webhookify endpoint is logged with full request details, timing metrics, and response status. AI-powered analysis identifies patterns in failures, and real-time alerts via Telegram, Discord, Slack, email, or push notifications ensure you catch issues immediately. The mobile app even plays a cash register sound for payment events — so you literally hear your business working.

See Your Webhooks in Action

Create instant webhook endpoints with full request logging, delivery monitoring, and AI-powered insights. Understand exactly what is happening with every webhook delivery.

Start Monitoring Free

Key Takeaways

Understanding how webhooks work under the hood helps you build better integrations:

  1. Webhooks are HTTP POST requests — they use the same protocol as REST APIs, just in the opposite direction.
  2. Production webhook systems use message queues — events are not sent synchronously. They are queued and delivered by worker pools.
  3. At-least-once delivery is the standard — always implement idempotency in your handlers.
  4. Retries use exponential backoff — failed deliveries are retried with increasing delays.
  5. Events can arrive out of order — design your handlers to be order-independent or implement ordering logic.
  6. Respond fast — acknowledge receipt within seconds and process asynchronously.
  7. Verify signatures — always verify the authenticity of incoming webhooks.
