SMS Webhooks and Delivery Tracking: Events, Retries, and Storage

The API response is not the delivery result

When your send request succeeds, you usually know one thing: the messaging platform accepted your request. Delivery is asynchronous. The carrier may accept, delay, filter, expire, reject, or report a status later. Your application needs webhooks because the truth changes after the initial API response.

Twilio describes outbound status callbacks as a way to track status changes through the message lifecycle. Vonage notes that a successful SMS API response means a message was queued, not necessarily delivered, and that delivery receipts vary in reliability by market and receipt type.

If your database only has sent = true, your support team has no delivery tracking. It has a guess.

Normalize provider events into your own model

Every provider has its own payload names, status vocabulary, retry behavior, and error codes. Store the raw payload, but do not make the rest of your product depend on raw provider fields. Normalize events into a compact lifecycle that your app understands.

Normalized status	Typical provider meaning	Final?
accepted	API request accepted or message queued.	No
sent	Message handed to the downstream network or messaging channel.	No
delivered	Provider received a successful delivery receipt from the carrier or handset.	Usually
undelivered	Delivery receipt says the handset or destination was not reached.	Yes
failed	Provider could not send, route, or process the message.	Yes
expired	Carrier retry window ended before delivery.	Yes
rejected	Carrier, provider, or policy rejected the message.	Yes
unknown	No useful final state is available.	Maybe

Do not assume events arrive exactly once or in perfect order. Webhook systems retry, networks fail, and providers can add fields over time. Design your event processor to be idempotent and tolerant of extra payload data.
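The lifecycle table above can be sketched as a small lookup. This is a minimal sketch, not any provider's official vocabulary: the provider status strings in the map and the function name mapProviderStatus are illustrative assumptions, so check your provider's documentation before relying on them.

```typescript
// The eight normalized states from the lifecycle table.
type NormalizedStatus =
  | "accepted" | "sent" | "delivered" | "undelivered"
  | "failed" | "expired" | "rejected" | "unknown";

// Illustrative mapping only: the keys are example provider status strings,
// not a verified list for any specific provider.
const PROVIDER_STATUS_MAP = new Map<string, NormalizedStatus>([
  ["queued", "accepted"],
  ["accepted", "accepted"],
  ["sending", "sent"],
  ["sent", "sent"],
  ["delivered", "delivered"],
  ["undelivered", "undelivered"],
  ["failed", "failed"],
  ["expired", "expired"],
  ["rejected", "rejected"],
]);

// Unrecognized or missing values become "unknown" instead of throwing,
// so a new provider status cannot break the webhook handler.
function mapProviderStatus(rawStatus: unknown): NormalizedStatus {
  if (typeof rawStatus !== "string") return "unknown";
  return PROVIDER_STATUS_MAP.get(rawStatus.trim().toLowerCase()) ?? "unknown";
}
```

A Map is used rather than a plain object so that a status string such as "constructor" cannot accidentally match an inherited property.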

The records worth storing

You need two levels of storage: the current message summary for quick reads, and an append-only event timeline for audit and debugging. The summary powers dashboards and product state. The timeline explains how the message got there.

Field	Where	Why
id	sms_messages	Stable internal message ID used by your app.
provider_message_id	sms_messages	Lets you reconcile with provider logs and support.
recipient_hash	sms_messages	Useful for debugging without exposing phone numbers broadly.
destination_country	sms_messages	Delivery behavior and rules vary heavily by country.
sender_identity	sms_messages	Separates 10DLC, toll-free, short code, sender ID, or route behavior.
template_key	sms_messages	Helps detect template-specific filtering or copy problems.
current_status	sms_messages	Fast product reads and support filtering.
provider_event_id	sms_message_events	Webhook deduplication when the provider supplies a unique ID.
status	sms_message_events	The lifecycle state from each callback.
error_code	sms_message_events	Debugging, alerting, and provider support escalation.
raw_payload	sms_message_events	Future-proof audit trail when mappings change.
received_at	sms_message_events	Your system time, separate from provider event time.

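-- Current summary: one row per message, updated as events arrive.
create table sms_messages (
  id text primary key,
  provider_message_id text,
  recipient_hash text not null,
  destination_country text,
  sender_identity text,
  template_key text,
  current_status text not null,
  created_at timestamptz not null,
  updated_at timestamptz not null
);

-- Append-only timeline: one row per webhook event.
create table sms_message_events (
  id text primary key,
  message_id text not null references sms_messages(id),
  provider_event_id text,
  status text not null,
  error_code text,
  provider_occurred_at timestamptz,
  raw_payload jsonb not null,
  received_at timestamptz not null
);

-- Lets webhook retries upsert instead of inserting duplicates.
-- PostgreSQL treats NULLs as distinct, so events without a provider ID still insert.
create unique index sms_message_events_provider_event_id_key
  on sms_message_events (provider_event_id);

-- Speeds up matching an incoming callback to its message.
create index sms_messages_provider_message_id_idx
  on sms_messages (provider_message_id);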
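One way to fill the recipient_hash column is a keyed hash of the phone number. The sketch below assumes Node.js, an E.164-style input, and a secret key held outside the database; the function name hashRecipient is hypothetical. A keyed hash is used because phone numbers have little entropy, so an unkeyed hash could be reversed by enumerating the number space.

```typescript
import { createHmac } from "node:crypto";

// Keyed hash: the same number always maps to the same value, but the value
// cannot be recomputed by someone who does not hold the secret key.
function hashRecipient(phoneE164: string, secretKey: string): string {
  // Keep digits only so formatting variants of one number hash identically.
  const canonical = phoneE164.replace(/\D/g, "");
  return createHmac("sha256", secretKey).update(canonical).digest("hex");
}
```

Support tooling can then group messages by recipient_hash without ever displaying the number itself.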

Build the webhook handler like an ingestion pipeline

Twilio's webhook security docs recommend HTTPS and signature validation, and warn that webhook parameters can evolve. That is the right shape for any provider integration: validate authenticity, preserve the raw request, map only the fields you understand, and do not break when new fields appear.

1. Receive the raw request body before any middleware mutates it.
2. Verify the provider signature using the exact URL, headers, and raw body or form parameters required by that provider.
3. Reject invalid signatures before writing state.
4. Map the provider message ID and status into your normalized lifecycle.
5. Deduplicate by provider event ID when available, otherwise by provider message ID, status, and provider timestamp.
6. Write the event and update the current summary in one transaction.
7. Return 2xx only after the event is safely stored.
8. Send unknown statuses to a dead-letter or review queue instead of dropping them.

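// verifyWebhookSignature, parseProviderPayload, normalizeSmsEvent, and db are
// application-defined helpers; their names and shapes are placeholders.
async function handleSmsWebhook(request: Request): Promise<Response> {
  // Read the raw body first: signature checks need the exact text the provider sent.
  const rawBody = await request.text();
  // The header name varies by provider.
  const signature = request.headers.get("x-provider-signature");

  // Reject unauthenticated requests before touching any state.
  if (!signature || !verifyWebhookSignature({ rawBody, signature, url: request.url })) {
    return new Response("invalid signature", { status: 401 });
  }

  const event = parseProviderPayload(rawBody);
  const normalized = normalizeSmsEvent(event);

  try {
    // Store the event and update the summary atomically.
    await db.transaction(async (tx) => {
      // Upsert keyed on the provider event ID so retries do not create duplicates.
      await tx.smsMessageEvents.upsert({
        providerEventId: normalized.providerEventId,
        messageId: normalized.messageId,
        status: normalized.status,
        errorCode: normalized.errorCode,
        providerOccurredAt: normalized.providerOccurredAt,
        rawPayload: event,
        receivedAt: new Date()
      });

      // updateCurrentStatus is expected to apply status precedence so a late
      // intermediate event cannot overwrite a final state.
      await tx.smsMessages.updateCurrentStatus(normalized.messageId, normalized.status);
    });
  } catch (error) {
    // A non-2xx response lets the provider retry instead of losing the event.
    console.error("failed to store SMS webhook event", error);
    return new Response("storage error", { status: 500 });
  }

  return new Response("ok", { status: 200 });
}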
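The handler above calls verifyWebhookSignature without defining it. The sketch below shows one possible shape, assuming Node.js and a hypothetical scheme in which the provider sends a hex-encoded HMAC-SHA256 of the URL followed by the raw body. Real providers each define their own algorithm, encoding, and signed fields, so treat the scheme, the SMS_WEBHOOK_SECRET variable, and the optional secret parameter as assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

interface SignatureInput {
  rawBody: string;
  signature: string | null;
  url: string;
}

// Hypothetical scheme: hex(HMAC-SHA256(secret, url + rawBody)).
function verifyWebhookSignature(
  { rawBody, signature, url }: SignatureInput,
  secret: string = process.env.SMS_WEBHOOK_SECRET ?? "",
): boolean {
  // Fail closed when the signature or the secret is missing.
  if (!signature || secret === "") return false;

  const expected = createHmac("sha256", secret).update(url + rawBody).digest();
  const received = Buffer.from(signature, "hex");

  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;

  // Constant-time comparison avoids leaking how many bytes matched.
  return timingSafeEqual(received, expected);
}
```

Whatever the provider's real scheme is, the same two habits apply: sign the exact bytes received, and compare in constant time.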

Handle duplicates and out-of-order events

Webhook delivery is usually at-least-once. That means duplicate callbacks are normal. If your handler increments counters, sends user notifications, or triggers fallbacks on every callback without deduplication, one provider retry can become a product bug.
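A deduplication key makes that concrete. The sketch below prefers the provider event ID and falls back to provider message ID, status, and provider timestamp. The field names are hypothetical, and the in-memory set only stands in for what would normally be a unique index in the database.

```typescript
interface DedupeInput {
  providerEventId?: string | null;
  providerMessageId: string;
  status: string;
  providerOccurredAt?: string | null; // ISO 8601 timestamp, if the provider sends one
}

// Prefer the provider's own event ID; otherwise build a composite key.
// JSON.stringify keeps the parts unambiguous even if they contain separators.
function dedupeKey(e: DedupeInput): string {
  if (e.providerEventId) return JSON.stringify(["event", e.providerEventId]);
  return JSON.stringify([
    "composite",
    e.providerMessageId,
    e.status,
    e.providerOccurredAt ?? null,
  ]);
}

// In-memory stand-in for a unique index on the dedupe key.
class SeenEvents {
  private readonly keys = new Set<string>();

  // Returns true the first time a key is recorded and false for repeats.
  recordOnce(key: string): boolean {
    if (this.keys.has(key)) return false;
    this.keys.add(key);
    return true;
  }
}
```

Side effects such as user notifications or fallback sends would run only when recordOnce returns true.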

Out-of-order events are just as important. You might receive accepted after delivered, or a delayed intermediate state after a final state. Keep an explicit status precedence model so old intermediate events do not downgrade a message that already reached a final state.
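One possible precedence model is a numeric rank per status, where the summary only changes when the incoming event outranks the current one. The ranks below are an assumption chosen for illustration. In particular, letting a late delivered receipt override a negative final state is a policy choice your product may not want.

```typescript
type Status =
  | "accepted" | "sent" | "delivered" | "undelivered"
  | "failed" | "expired" | "rejected" | "unknown";

// Assumed ranks: intermediate states sit below final states, and a
// successful delivery receipt outranks every other outcome.
const STATUS_RANK: Record<Status, number> = {
  unknown: 0,
  accepted: 1,
  sent: 2,
  undelivered: 3,
  failed: 3,
  expired: 3,
  rejected: 3,
  delivered: 4,
};

// Returns the status the summary row should hold after seeing `incoming`.
// Ties keep the current value, so the first final state to arrive wins.
function nextCurrentStatus(current: Status, incoming: Status): Status {
  return STATUS_RANK[incoming] > STATUS_RANK[current] ? incoming : current;
}
```

Every event is still appended to the timeline; only the summary update is gated by rank.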

Problem	Bad behavior	Better behavior
Duplicate delivered event	Send the user two success notifications.	Upsert by event ID and make side effects idempotent.
Late sent event after failed	Change the message back to in progress.	Store the event but keep the final summary state.
Unknown provider status	Throw 500 forever or drop it silently.	Store raw payload, mark unknown, and alert engineering.
Provider or handler outage	Lose callbacks during downtime.	Return non-2xx when storage fails so the provider retries, and run reconciliation jobs to catch anything still missing.

Give support a real delivery timeline

The best delivery tracking work shows up in support. A support agent should be able to answer: when did the user request the SMS, what number was used, which sender identity sent it, what did the provider say, what did the carrier say, did we retry, and what should the user do next?

Show timestamps in the user's local timezone and UTC.
Mask phone numbers by default, with audited reveal access for trusted support roles.
Translate provider errors into plain-language support notes.
Link every message to the product action that triggered it.
Expose country, sender identity, provider, and template so patterns are visible.
Make it easy to copy a provider message ID for escalation.
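Two of the items above, masked numbers and dual timestamps, are small enough to sketch. The function names and the row shape are hypothetical, and the locale is fixed to en-US only to keep the example short.

```typescript
// Masks all but the last `visible` digits, preserving a leading "+".
function maskPhone(phone: string, visible = 4): string {
  const digits = phone.replace(/\D/g, "");
  const keep = Math.min(Math.max(visible, 0), digits.length);
  const masked = "*".repeat(digits.length - keep);
  const shown = digits.slice(digits.length - keep);
  return (phone.trim().startsWith("+") ? "+" : "") + masked + shown;
}

interface TimelineRow {
  status: string;
  utc: string;   // ISO 8601, always UTC
  local: string; // rendered in the user's IANA time zone
}

// Builds one support-facing timeline row with both UTC and local time.
function timelineRow(occurredAt: Date, status: string, userTimeZone: string): TimelineRow {
  const local = new Intl.DateTimeFormat("en-US", {
    year: "numeric", month: "short", day: "numeric",
    hour: "2-digit", minute: "2-digit", second: "2-digit",
    timeZone: userTimeZone, timeZoneName: "short",
  }).format(occurredAt);
  return { status, utc: occurredAt.toISOString(), local };
}
```

For example, maskPhone("+14155550100") returns "+*******0100". The unmasked number would sit behind the audited reveal action rather than in the default view.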

This is also where product analytics becomes useful. If OTP completion drops in one country, your webhook data can show whether users are failing to request codes, carriers are rejecting messages, or receipts are delayed.

SMS webhook FAQ

Should webhook handlers return 200 immediately?

Return 2xx only after you have validated and durably stored the event. If storage fails, a non-2xx response lets the sender retry instead of losing the event.

Do I need to store raw webhook payloads?

Yes. Store them with access controls. Raw payloads help when provider fields change, mappings are wrong, support escalates, or you need to reconcile with provider logs.

Can I trust delivered as a final truth?

Use delivered as the best available delivery signal, but avoid wording that guarantees a human saw the message. Delivery receipt certainty varies: Vonage, for example, notes that receipt reliability differs by market and receipt type.

Should I poll instead of using webhooks?

Polling can be useful for reconciliation, but webhooks should be the primary path for timely delivery updates. A nightly reconciliation job can catch missed or inconsistent events.
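The selection step of such a reconciliation job might look like the sketch below. It only picks candidates; the provider lookup itself is omitted because it depends on the provider's API. The MessageSummary shape, the set of non-final statuses, and the age threshold are assumptions.

```typescript
interface MessageSummary {
  id: string;
  providerMessageId: string | null;
  currentStatus: string;
  updatedAt: Date;
}

// Statuses that may still change; assumed to match your normalized lifecycle.
const NON_FINAL_STATUSES = new Set(["accepted", "sent", "unknown"]);

// Picks messages that still look in flight after `maxAgeMs`, so a scheduled
// job can ask the provider for their latest status.
function selectForReconciliation(
  messages: MessageSummary[],
  now: Date,
  maxAgeMs: number,
): MessageSummary[] {
  return messages.filter(
    (m) =>
      m.providerMessageId !== null && // nothing to look up without a provider ID
      NON_FINAL_STATUSES.has(m.currentStatus) &&
      now.getTime() - m.updatedAt.getTime() >= maxAgeMs,
  );
}
```

Any status the job retrieves can be fed through the same event pipeline as a webhook, so deduplication and precedence rules still apply.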
