SIP Call Flow Explained: A Step-by-Step Walkthrough of Every Message


Every successful VoIP call is a short sequence of SIP messages exchanged in a strict order. When something breaks (one-way audio, no ringback, dropped calls at the four-minute mark, calls that complete with the wrong codec), the answer is almost always visible in that sequence. Reading a SIP trace is a fundamental skill for anyone designing, deploying, or troubleshooting voice networks, and the first step is knowing exactly what each message looks like, where it goes, and what it carries.

This article walks through every message in a typical SIP call, from the initial INVITE to the closing 200 OK on the BYE, including the SDP offer/answer that negotiates the audio path. We then cover the most common failure flows you will see in production traces (busy, auth challenge, CANCEL during ringing) and the mid-call modifications used for hold, codec changes, and transfer. If you want the conceptual overview first, the companion piece SIP Signaling Fundamentals covers the protocol roles and architecture at a higher level.

Key Terms and Concepts
A quick-reference glossary for terms used throughout this article.
INVITE is the SIP request method that initiates a new session and carries the caller’s session description (SDP) describing the media the caller can send and receive.
100 Trying is a provisional response that confirms the next hop has received the INVITE and is processing it, so the caller stops retransmitting.
180 Ringing is a provisional response indicating that the destination endpoint is alerting the called party (the phone is ringing).
200 OK is the success response to an INVITE that signals the call has been answered and contains the callee’s SDP answer.
ACK is the request that confirms receipt of the 200 OK and completes the three-way INVITE handshake.
BYE is the request used by either side to terminate an established session.
CANCEL is the request a caller sends to abandon an INVITE that has not yet been answered, typically after 180 Ringing.
SDP (Session Description Protocol) is the body format inside SIP messages that describes media types, codecs, IP addresses, and ports for the audio or video stream.
Via header records the path a request traveled so responses can find their way back through the same hops in reverse.
From and To headers identify the originating and target parties of a dialog; tags appended to these headers uniquely identify the call leg.
Call-ID is a globally unique string that stays the same in every message of a call; combined with the From and To tags, it identifies a single SIP dialog.
CSeq is a sequence number plus method name that orders requests within a dialog and prevents out-of-order processing.
RTP (Real-time Transport Protocol) is the protocol that carries the actual audio packets, running separately from SIP on the ports negotiated in the SDP.
re-INVITE is a second INVITE sent inside an established dialog to modify the session (hold, codec change, media address change).
B2BUA (Back-to-Back User Agent) is a network element that terminates the incoming SIP call and originates an independent outgoing call, splitting the dialog into two legs it controls fully.

The Three Phases of a SIP Call

Every SIP call moves through three phases, and every message you will see in a trace belongs to one of them. The phases are useful because they map directly to what should be happening on the wire at a given point.

Setup covers the messages that establish the dialog: INVITE, the provisional responses (100, 180, sometimes 183 with early media), the final 200 OK, and the ACK that closes the handshake. The large majority of call problems show up in this phase.

Media exchange is the period after ACK during which RTP packets flow directly between the endpoints (or through a B2BUA that anchors media). No SIP messages are exchanged during this phase unless something changes about the session.

Teardown closes the dialog with a BYE from either side and a 200 OK confirming receipt. Once both messages are exchanged, the dialog no longer exists and any further messages with the same Call-ID will be rejected.

Phase 1: Call Setup, Message by Message

This is where the call is born. Alice (at company.com) is calling Bob (at provider.com). We will look at each message on the wire, header by header, so you can match what you see in a trace to what the protocol is actually doing.

Step 1: The INVITE

Alice’s phone builds an INVITE and sends it toward Bob’s domain. The first line is the request line, naming the method (INVITE), the target URI, and the SIP version. The headers that follow identify the parties, the dialog, the path, and the body.

INVITE sip:bob@provider.com SIP/2.0
Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds
Max-Forwards: 70
From: "Alice" <sip:alice@company.com>;tag=1928301774
To: "Bob" <sip:bob@provider.com>
Call-ID: a84b4c76e66710@192.0.2.10
CSeq: 314159 INVITE
Contact: <sip:alice@192.0.2.10:5060>
Content-Type: application/sdp
Content-Length: 142

Several things are worth noticing. The From header has a tag generated by Alice’s phone; the To header does not yet have a tag because the dialog is not yet established (Bob’s UAS will add a To-tag on the response). The Via branch parameter starts with z9hG4bK, which is the magic cookie that marks the branch ID as generated according to RFC 3261. The Call-ID stays constant for the life of the dialog. The CSeq starts at an arbitrary integer and increments with each new request within the dialog; ACK and CANCEL are the exceptions, because they reuse the number of the INVITE they refer to.
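
To check these details in a captured trace, the headers first have to be pulled apart. The following is a minimal Python sketch, not a full SIP parser: it assumes one header per line, ignores compact header forms and repeated Via headers, and the helper names parse_message and header_param are illustrative. The sample message is modeled on the INVITE above with an empty body.

```python
MAGIC_COOKIE = "z9hG4bK"  # branch prefix defined by RFC 3261

# Illustrative message modeled on the INVITE above (empty body for brevity).
RAW_INVITE = (
    "INVITE sip:bob@provider.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    'From: "Alice" <sip:alice@company.com>;tag=1928301774\r\n'
    'To: "Bob" <sip:bob@provider.com>\r\n'
    "Call-ID: a84b4c76e66710@192.0.2.10\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_message(raw):
    """Split a SIP message into (start_line, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body


def header_param(value, name):
    """Return a ;name=value header parameter, or None if it is absent."""
    # Look only after the closing '>' so URI parameters are not picked up.
    tail = value.rsplit(">", 1)[-1]
    for part in tail.split(";")[1:]:
        key, _, val = part.partition("=")
        if key.strip() == name:
            return val.strip()
    return None


start_line, headers, body = parse_message(RAW_INVITE)
from_tag = header_param(headers["from"], "tag")    # set by the caller
to_tag = header_param(headers["to"], "tag")        # None until the UAS answers
branch = header_param(headers["via"], "branch")
print(start_line, from_tag, to_tag, branch.startswith(MAGIC_COOKIE))
```

A missing To-tag on the initial INVITE is expected; a missing From-tag or a branch without the z9hG4bK prefix is worth a closer look.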

Step 2: 100 Trying

The next hop responds almost immediately with a provisional 100 Trying. This is a hop-by-hop acknowledgment, not an end-to-end one, and its only purpose is to stop Alice’s phone from retransmitting the INVITE while the next hop does its work.

SIP/2.0 100 Trying
Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds
From: "Alice" <sip:alice@company.com>;tag=1928301774
To: "Bob" <sip:bob@provider.com>
Call-ID: a84b4c76e66710@192.0.2.10
CSeq: 314159 INVITE
Content-Length: 0

Notice that the Via, From, To, Call-ID, and CSeq are copied from the request. SIP routing depends on this consistency: the response travels back through the same Via chain in reverse, and matching headers tie the response to the request it answers. A 100 Trying never reaches the caller’s screen, and there is no ACK for any 1xx response.
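
The matching rule can be sketched in a few lines of Python. This is a simplified illustration that assumes headers are already available as a dict and that the Via value shown is the topmost one; the function name transaction_key and the pending table are hypothetical. It pairs a response with its request using the Via branch and the CSeq method.

```python
def transaction_key(headers):
    """Return (branch, method) taken from the top Via and the CSeq header."""
    via_params = headers["Via"].split(";")[1:]
    branch = next(
        p.split("=", 1)[1] for p in via_params if p.startswith("branch=")
    )
    method = headers["CSeq"].split()[1]
    return branch, method


invite = {
    "Via": "SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds",
    "CSeq": "314159 INVITE",
}
trying = {
    "Via": "SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds",
    "CSeq": "314159 INVITE",
}
stray = {
    "Via": "SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bKother",
    "CSeq": "314159 INVITE",
}

# Requests waiting for a response, indexed by transaction key.
pending = {transaction_key(invite): "INVITE to bob@provider.com"}

print(pending.get(transaction_key(trying)))  # matches the pending INVITE
print(pending.get(transaction_key(stray)))   # no match: different branch
```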

Step 3: 180 Ringing

Once the INVITE reaches Bob’s phone, the device begins alerting (the phone rings) and sends back a 180 Ringing. This is the message that triggers the ringback tone on Alice’s side.

SIP/2.0 180 Ringing
Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds
From: "Alice" <sip:alice@company.com>;tag=1928301774
To: "Bob" <sip:bob@provider.com>;tag=314259
Call-ID: a84b4c76e66710@192.0.2.10
CSeq: 314159 INVITE
Contact: <sip:bob@198.51.100.20:5060>
Content-Length: 0

The To header now has a tag added by Bob’s UAS (tag=314259). From this point forward, the combination of Call-ID, From-tag, and To-tag uniquely identifies the dialog. A variant of this step is 183 Session Progress, which carries an SDP answer and is used to deliver early media (in-band ringback or announcements from the network before the call is answered).
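
When filtering a large capture down to one call leg, it helps to compute that three-part identifier directly. The sketch below assumes headers are already in a dict; the name dialog_id and the regular expression are illustrative choices, not part of any SIP library.

```python
import re

TAG = re.compile(r";\s*tag=([^;\s]+)")


def dialog_id(headers):
    """Return (call_id, from_tag, to_tag); to_tag is None before the UAS replies."""
    def tag(value):
        match = TAG.search(value)
        return match.group(1) if match else None

    return headers["Call-ID"], tag(headers["From"]), tag(headers["To"])


invite = {
    "Call-ID": "a84b4c76e66710@192.0.2.10",
    "From": '"Alice" <sip:alice@company.com>;tag=1928301774',
    "To": '"Bob" <sip:bob@provider.com>',
}
ringing = dict(invite, To='"Bob" <sip:bob@provider.com>;tag=314259')

print(dialog_id(invite))   # To-tag still missing: no dialog yet
print(dialog_id(ringing))  # all three parts present: early dialog exists
```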

Step 4: 200 OK

When Bob picks up, his phone sends a 200 OK with its own SDP answer. This is the most important message in the entire flow: it accepts the call and locks in the media parameters.

SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds
From: "Alice" <sip:alice@company.com>;tag=1928301774
To: "Bob" <sip:bob@provider.com>;tag=314259
Call-ID: a84b4c76e66710@192.0.2.10
CSeq: 314159 INVITE
Contact: <sip:bob@198.51.100.20:5060>
Content-Type: application/sdp
Content-Length: 139

v=0
o=bob 2890844527 2890844527 IN IP4 198.51.100.20
s=-
c=IN IP4 198.51.100.20
t=0 0
m=audio 49170 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000

The body after the blank line is the SDP. The m=audio line declares that Bob will receive audio on UDP port 49170 and supports codecs 0 (PCMU) and 8 (PCMA). The c= line gives the IP address. Alice’s phone will start sending RTP to 198.51.100.20:49170 as soon as it processes this message.
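
Reading the RTP destination out of an SDP body is mechanical enough to script. The following Python sketch assumes a single audio stream and a session-level c= line that appears before the m= line, as in the answer above; a media-level c= line placed after m= is not handled, and the function name rtp_target is illustrative.

```python
SDP_ANSWER = "\r\n".join([
    "v=0",
    "o=bob 2890844527 2890844527 IN IP4 198.51.100.20",
    "s=-",
    "c=IN IP4 198.51.100.20",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0 8",
    "a=rtpmap:0 PCMU/8000",
]) + "\r\n"


def rtp_target(sdp):
    """Return (address, port, payload_types) for the first audio stream."""
    address = None
    for line in sdp.splitlines():
        if line.startswith("c="):
            # c=<nettype> <addrtype> <connection-address>
            address = line.split()[-1]
        elif line.startswith("m=audio"):
            # m=audio <port> <proto> <payload type> ...
            fields = line.split()
            port = int(fields[1])
            payload_types = [int(pt) for pt in fields[3:]]
            return address, port, payload_types
    raise ValueError("no audio media line found")


print(rtp_target(SDP_ANSWER))
```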

Step 5: ACK

Alice’s phone confirms the 200 OK with an ACK request. The ACK is unique in SIP because it is the one request that never receives a response of its own: it completes the INVITE handshake rather than opening a new exchange.

ACK sip:bob@198.51.100.20:5060 SIP/2.0
Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK4321
Max-Forwards: 70
From: "Alice" <sip:alice@company.com>;tag=1928301774
To: "Bob" <sip:bob@provider.com>;tag=314259
Call-ID: a84b4c76e66710@192.0.2.10
CSeq: 314159 ACK
Content-Length: 0

Two details are worth flagging. The CSeq number stays at 314159 (it matches the INVITE), but the method changes to ACK. The Request-URI is now Bob’s Contact address from the 200 OK rather than the original sip:bob@provider.com, because the ACK travels directly to Bob’s endpoint, bypassing any proxies in the original path unless they asked to stay in the signaling path with a Record-Route header. The dialog is now fully established.
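
Those two rules can be expressed as a short Python sketch that assembles the ACK from the original INVITE and the 200 OK. It is a simplification under stated assumptions: headers are plain dicts, the Contact is a bare URI in angle brackets, no Record-Route is present, and build_ack is a hypothetical helper name.

```python
def build_ack(invite, ok, branch):
    """Build the ACK for a 2xx response to an INVITE.

    invite, ok: dicts of header name -> value; branch: a fresh branch ID.
    """
    request_uri = ok["Contact"].strip().strip("<>")  # target comes from Contact
    cseq_number = invite["CSeq"].split()[0]          # same number as the INVITE
    sent_by = invite["Via"].split(";")[0]            # transport and address only
    lines = [
        f"ACK {request_uri} SIP/2.0",
        f"Via: {sent_by};branch={branch}",
        "Max-Forwards: 70",
        f"From: {invite['From']}",
        f"To: {ok['To']}",                           # now carries Bob's tag
        f"Call-ID: {invite['Call-ID']}",
        f"CSeq: {cseq_number} ACK",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


invite = {
    "Via": "SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds",
    "From": '"Alice" <sip:alice@company.com>;tag=1928301774',
    "Call-ID": "a84b4c76e66710@192.0.2.10",
    "CSeq": "314159 INVITE",
}
ok = {
    "To": '"Bob" <sip:bob@provider.com>;tag=314259',
    "Contact": "<sip:bob@198.51.100.20:5060>",
}

print(build_ack(invite, ok, "z9hG4bK4321"))
```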

The SDP Offer/Answer Model

SIP only negotiates the call. The actual media uses RTP, and the parameters of that RTP session are negotiated inside the SDP bodies carried by SIP messages. The mechanism is called offer/answer, defined in RFC 3264.

Alice’s INVITE carries an SDP offer listing every codec her phone supports, in preference order. Bob’s 200 OK carries an SDP answer listing only the codecs he is willing to use, in the order he prefers, plus the IP address and port where he wants to receive RTP. The intersection of those two lists is the codec the call will actually use.

A simple offer might look like this:

m=audio 49172 RTP/AVP 0 8 9 18
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:9 G722/8000
a=rtpmap:18 G729/8000
a=sendrecv

Alice is offering G.711 µ-law (0), G.711 A-law (8), G.722 (9), and G.729 (18). Bob’s answer usually narrows that to one codec, typically the highest-priority option in Bob’s own preference list that is also in Alice’s offer; when an answer lists several payload types, as in the 200 OK in Step 4, the first one listed is normally the one used. The a=sendrecv attribute indicates the media is bidirectional; other values include sendonly, recvonly, and inactive, which become important during call hold.
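
The selection rule described here, first entry in the answerer’s preference list that also appears in the offer, can be sketched as follows. The function name choose_codec and Bob’s preference lists are assumptions for illustration; real endpoints apply additional policy.

```python
def choose_codec(offered, answerer_preference):
    """Return the first payload type the answerer prefers that was also offered.

    Returns None when the two lists do not overlap, which is the case
    that ends in a 488 Not Acceptable Here.
    """
    for payload_type in answerer_preference:
        if payload_type in offered:
            return payload_type
    return None


alice_offer = [0, 8, 9, 18]   # PCMU, PCMA, G722, G729, as in the offer above
bob_prefers = [9, 0]          # hypothetical: Bob likes G.722 first, then PCMU

print(choose_codec(alice_offer, bob_prefers))  # 9 (G.722)
print(choose_codec(alice_offer, [3]))          # None: no common codec
```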

Two common failures live in the SDP. The first is no overlap between the offered and supported codec lists, which produces a 488 Not Acceptable Here response and the call never connects. The second is a Network Address Translation (NAT) problem where the c= address inside SDP is a private IP that the other side cannot reach, producing a connected call with no audio in one or both directions. An SBC can address both: it can transcode between otherwise incompatible codecs, and it can rewrite the SDP so that media is anchored on an address both sides can reach.
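
The NAT case is easy to spot programmatically. This Python sketch flags c= lines that carry an RFC 1918 address; it checks only IPv4 and only the three RFC 1918 ranges, and the function name is illustrative. It deliberately avoids the standard library’s is_private property, which also reports documentation ranges such as 198.51.100.0/24 as private.

```python
import ipaddress

RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def private_connection_addresses(sdp):
    """Return every IPv4 c= address in the SDP that falls in an RFC 1918 range."""
    found = []
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            address = ipaddress.ip_address(line.split()[2])
            if any(address in network for network in RFC1918):
                found.append(str(address))
    return found


behind_nat = "v=0\r\nc=IN IP4 192.168.1.20\r\nm=audio 49172 RTP/AVP 0\r\n"
public = "v=0\r\nc=IN IP4 198.51.100.20\r\nm=audio 49170 RTP/AVP 0\r\n"

print(private_connection_addresses(behind_nat))  # ['192.168.1.20']
print(private_connection_addresses(public))      # []
```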

Phase 2: Media Flowing Over RTP

Once the ACK is sent, the SIP dialog goes quiet and RTP takes over. RTP packets are small UDP datagrams, each typically carrying 20 ms of encoded audio, sent every 20 ms in each direction. For an established 30-minute call with no modifications, that works out to 50 packets per second per direction, or roughly 180,000 RTP packets in total, and, unless session-timer refreshes are in use, zero SIP messages.
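
The packet count follows directly from the packetization interval. A small Python sketch of the arithmetic, assuming 20 ms packets, two directions, and (for the payload size) G.711 at 8,000 one-byte samples per second; the function names are illustrative.

```python
def rtp_packets(call_seconds, ptime_ms=20, directions=2):
    """Total RTP packets for a call of the given length."""
    packets_per_second = 1000 // ptime_ms
    return call_seconds * packets_per_second * directions


def g711_payload_bytes(ptime_ms=20, sample_rate=8000):
    """Payload bytes per packet for G.711, which uses one byte per sample."""
    return sample_rate * ptime_ms // 1000


print(rtp_packets(30 * 60))     # 180000 packets for a 30-minute call
print(g711_payload_bytes())     # 160 bytes of audio in each packet
```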

RTP runs alongside RTCP (RTP Control Protocol), which carries quality reports on a separate port (usually RTP port + 1). RTCP packets are how endpoints exchange jitter, packet loss, and round-trip delay statistics. Many SBCs use RTCP data to drive MOS scoring and call-quality alerts even though they do not generate the audio themselves.

One thing that catches new engineers off guard is that RTP is bearer-only and has no concept of the SIP dialog. If a phone stops receiving RTP, it has no SIP-level way to know whether the other side has hung up, frozen, or simply lost network connectivity. This is why most user agents implement a session timer (RFC 4028): they periodically send a re-INVITE or UPDATE during long calls to confirm the dialog is still alive. If the refresh fails, the side that detected the failure tears the call down with a BYE.

Phase 3: Call Teardown with BYE

When either party hangs up, that side sends a BYE within the same dialog.

BYE sip:bob@198.51.100.20:5060 SIP/2.0
Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK998
Max-Forwards: 70
From: "Alice" <sip:alice@company.com>;tag=1928301774
To: "Bob" <sip:bob@provider.com>;tag=314259
Call-ID: a84b4c76e66710@192.0.2.10
CSeq: 314160 BYE
Content-Length: 0

The CSeq has advanced by one (now 314160) because BYE is a new request within the dialog. The other side replies with 200 OK confirming the BYE. RTP stops in both directions, and the dialog state is destroyed on both endpoints. Any subsequent SIP message bearing this Call-ID will produce a 481 Call/Transaction Does Not Exist.
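
The CSeq behavior across the whole call can be captured in a tiny counter. This is a hedged sketch for one side of one dialog: the class name LocalCSeq is hypothetical, and it tracks only the initial INVITE, so a re-INVITE would need its own number remembered for the matching ACK.

```python
class LocalCSeq:
    """Tracks the CSeq values one side of a dialog uses for its own requests."""

    def __init__(self, invite_number):
        self.invite_number = invite_number  # number used by the initial INVITE
        self.last = invite_number           # highest number sent so far

    def ack(self):
        """ACK reuses the INVITE's number; only the method changes."""
        return f"{self.invite_number} ACK"

    def next(self, method):
        """Any other new in-dialog request increments the number by one."""
        self.last += 1
        return f"{self.last} {method}"


cseq = LocalCSeq(314159)
print(cseq.ack())        # 314159 ACK
print(cseq.next("BYE"))  # 314160 BYE
```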

SIP call flow ladder diagram showing Caller to SBC to Callee with INVITE, 100 Trying, 180 Ringing, 200 OK with SDP, ACK, RTP media, BYE, and 200 OK in sequence


When the Call Doesn’t Connect: Failure Scenarios

Most real-world calls succeed, but when they fail, the failure mode is usually one of four common patterns. Recognizing the response code on the wire tells you immediately what to look at.

Authentication Challenge: 401 or 407

If the next hop requires authentication (a SIP trunking provider, an enterprise PBX with registration enforcement), it answers the first INVITE with a 401 Unauthorized (when the endpoint itself challenges) or 407 Proxy Authentication Required (when an intermediary proxy challenges). The challenge includes a WWW-Authenticate or Proxy-Authenticate header carrying a realm and a nonce.

The caller’s user agent first acknowledges the challenge with an ACK, then sends a second INVITE with the same Call-ID but an incremented CSeq, carrying an Authorization header (or Proxy-Authorization, in answer to a 407) that contains a digest computed from the username, realm, shared secret, nonce, request method, and URI. If the digest checks out, the flow continues normally with 100 Trying, 180 Ringing, and 200 OK. If it fails, the server typically either repeats the challenge or ends the attempt with a 403 Forbidden. Mismatched realm strings between the SBC and the upstream provider are one of the most common causes of “registration works but calls fail” tickets.
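
To see why a realm mismatch breaks authentication, it helps to look at how the digest is built. The Python sketch below follows the MD5 digest scheme from HTTP Digest authentication that SIP reuses; it covers only the MD5 algorithm with either no qop or qop=auth, and the credentials, nonce, and realm strings are made-up values.

```python
import hashlib


def _md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    qop=None, nc=None, cnonce=None):
    """Compute the 'response' value for a Digest Authorization header."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = _md5_hex(f"{method}:{uri}")                 # what is being asked
    if qop:
        return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


# Hypothetical values for illustration only.
good = digest_response("alice", "provider.com", "s3cret",
                       "INVITE", "sip:bob@provider.com", "abc123nonce")
wrong_realm = digest_response("alice", "sbc.provider.com", "s3cret",
                              "INVITE", "sip:bob@provider.com", "abc123nonce")

print(good)
print(good == wrong_realm)  # False: a different realm yields a different digest
```

Because the realm is hashed into the first component, a user agent that computes the digest with one realm while the server verifies with another will fail every time, even with the correct password.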

Busy: 486

A 486 Busy Here means the called endpoint is on another call and cannot accept this one. The dialog ends immediately, and the caller’s user agent typically renders a busy tone. A 600 Busy Everywhere indicates the user is globally unavailable, which terminates fork attempts at any forwarding proxy.

No Answer: 408 or 480

If Bob never picks up, the call ends one of two ways. A 480 Temporarily Unavailable is sent by Bob’s phone itself when the ringing timer expires (often after 30 to 60 seconds). A 408 Request Timeout is generated by the network if no response of any kind comes back within Timer B (default 32 seconds for UDP). Each tells a different story: 480 means the call reached the endpoint and was rejected by the endpoint’s policy; 408 means the call never got a usable response from anything downstream.
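
The 32-second figure comes from the RFC 3261 timer defaults: T1 is 500 ms, the INVITE retransmission interval (Timer A) starts at T1 and doubles after each retransmission, and Timer B is 64 times T1. The Python sketch below works out the resulting schedule for an unanswered INVITE over UDP; the function name is illustrative.

```python
T1 = 0.5  # seconds; RFC 3261 default round-trip time estimate


def invite_retransmit_schedule(t1=T1):
    """Return (retransmit_times, timer_b) in seconds since the first send."""
    timer_b = 64 * t1
    times = []
    elapsed = 0.0
    interval = t1              # Timer A starts at T1
    while elapsed + interval < timer_b:
        elapsed += interval
        times.append(elapsed)
        interval *= 2          # Timer A doubles after every retransmission
    return times, timer_b


times, timer_b = invite_retransmit_schedule()
print(times)    # [0.5, 1.5, 3.5, 7.5, 15.5, 31.5]
print(timer_b)  # 32.0, after which the transaction times out (408)
```

In a trace, seeing the same INVITE at these offsets with no 100 Trying in between is the signature of a 408 in the making.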

CANCEL: Caller Hangs Up While Ringing

If Alice hangs up while she still hears ringback (before Bob’s phone answers), her user agent sends a CANCEL referencing the same branch parameter as the original INVITE. CANCEL is a request, not a response. Bob’s side responds with two messages: a 200 OK to the CANCEL itself, and a 487 Request Terminated to the original INVITE. Alice’s user agent acknowledges the 487 with an ACK, and the dialog is destroyed before it was ever fully established. Misreading a CANCEL flow as a failed call is a frequent source of incorrect Answer-Seizure Ratio (ASR) metrics in raw call detail record (CDR) analysis.
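
A CANCEL is built almost entirely by copying from the INVITE it cancels. The following Python sketch shows which fields are reused; it assumes headers held in a dict, and build_cancel is a hypothetical helper, not a library call.

```python
def build_cancel(request_uri, invite):
    """Build a CANCEL for a pending INVITE.

    Request-URI, Via (including branch), From, To, Call-ID and the CSeq
    number are copied from the INVITE; only the method changes.
    """
    cseq_number = invite["CSeq"].split()[0]
    lines = [
        f"CANCEL {request_uri} SIP/2.0",
        f"Via: {invite['Via']}",          # same branch as the INVITE
        "Max-Forwards: 70",
        f"From: {invite['From']}",
        f"To: {invite['To']}",            # no To-tag, as in the INVITE
        f"Call-ID: {invite['Call-ID']}",
        f"CSeq: {cseq_number} CANCEL",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


invite = {
    "Via": "SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds",
    "From": '"Alice" <sip:alice@company.com>;tag=1928301774',
    "To": '"Bob" <sip:bob@provider.com>',
    "Call-ID": "a84b4c76e66710@192.0.2.10",
    "CSeq": "314159 INVITE",
}

print(build_cancel("sip:bob@provider.com", invite))
```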

Mid-Call Changes: re-INVITE, UPDATE, and REFER

Once the dialog is established, the call is not frozen. Either party can send a new request inside the existing dialog to modify the session. Three methods do most of the work.

A re-INVITE is the workhorse. It uses the same Call-ID, From-tag, and To-tag as the original INVITE but carries a fresh SDP offer. The most common use is call hold: the holding party sends a re-INVITE with the media line modified (a=sendonly, or the connection IP set to 0.0.0.0 in older implementations), the held party returns a 200 OK with the matching answer, and audio stops in one direction until a second re-INVITE restores it. Re-INVITEs are also used for codec renegotiation (switching from G.711 to G.729 when network conditions degrade) and for the switch from voice to T.38 when a fax is detected.
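
The SDP change behind a hold request is small. This Python sketch rewrites the direction attribute of an existing offer and, as the offer/answer rules require whenever the SDP changes, bumps the session version in the o= line. It assumes a single media section, and the function name with_direction and the sample SDP are illustrative.

```python
DIRECTION_LINES = {"a=sendrecv", "a=sendonly", "a=recvonly", "a=inactive"}

ACTIVE_SDP = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49172 RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
    "a=sendrecv",
]) + "\r\n"


def with_direction(sdp, direction):
    """Return a copy of the SDP with a new direction and a bumped version."""
    out = []
    for line in sdp.splitlines():
        if line in DIRECTION_LINES:
            continue                              # drop the old direction
        if line.startswith("o="):
            fields = line.split()
            fields[2] = str(int(fields[2]) + 1)   # session version + 1
            line = " ".join(fields)
        out.append(line)
    out.append(f"a={direction}")                  # single media section assumed
    return "\r\n".join(out) + "\r\n"


hold_offer = with_direction(ACTIVE_SDP, "sendonly")
print(hold_offer)
```

Resuming the call is the same operation with "sendrecv", sent in a second re-INVITE.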

UPDATE is similar to re-INVITE but designed to modify the session before it is fully established (during the early dialog state after a 180 Ringing). It is heavily used by some carriers to update session timers and SDP details without waiting for the call to be answered.

REFER implements call transfer. The transferring party sends a REFER to the other side containing a Refer-To header naming the transfer target. The receiving side sends a NOTIFY back as the transfer progresses, indicating whether the new call connected. Both attended transfer (consult first, then transfer) and blind transfer (transfer immediately) use REFER; the difference is just whether the transferor has already established a second dialog with the target.

What an SBC Does at Each Step

An SBC sits in the middle of this flow as a B2BUA, which means the call you see on one side of it is not the same SIP dialog as the call on the other side. To Alice, the SBC looks like Bob; to Bob, the SBC looks like Alice. Two independent dialogs exist, each with its own Call-ID, From-tag, To-tag, and CSeq counter, glued together by the SBC’s internal routing logic.

This architecture changes what every message in the flow can do. On the INVITE, the SBC can rewrite headers to match the downstream system’s expectations (this is where SIP header manipulation lives), strip private IP addresses from Via and Contact for topology hiding, and apply rate limiting or fraud rules before forwarding. On the 200 OK, the SBC rewrites the SDP so that media flows through the SBC rather than directly between endpoints, which gives it the ability to transcode codecs (G.711 to G.729, or AMR to G.711 across mobile-to-fixed interconnects), enforce SRTP encryption, and apply quality monitoring to every RTP packet.

On failure flows, the SBC can map response codes between dialects: a downstream 503 Service Unavailable might be translated to a 486 Busy Here upstream to keep retry behavior reasonable. On re-INVITEs for hold or codec changes, the SBC can choose to pass them through, terminate them on its own side and absorb the change, or trigger transcoding adjustments. On a CANCEL, the SBC propagates the cancellation downstream so the called party stops alerting, then cleans up both legs.
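
Response-code mapping of this kind is essentially a lookup table applied on the way back upstream. The Python sketch below is purely illustrative: the table contents and the name map_response are assumptions, not the configuration syntax of any particular SBC.

```python
# Hypothetical mapping: downstream code -> (upstream code, reason phrase).
UPSTREAM_RESPONSE_MAP = {
    503: (486, "Busy Here"),
}


def map_response(code, reason):
    """Return the (code, reason) to send upstream for a downstream response."""
    return UPSTREAM_RESPONSE_MAP.get(code, (code, reason))


print(map_response(503, "Service Unavailable"))  # (486, 'Busy Here')
print(map_response(404, "Not Found"))            # (404, 'Not Found'), unchanged
```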

The practical result is that a properly configured SBC removes most of the variance you would otherwise see between SIP implementations. Two vendors that cannot peer directly will peer through the SBC, because the SBC normalizes the call flow on each leg to match what that leg expects.

Conclusion

The SIP call flow is short, deterministic, and visible. Five messages take a call from dial to talk (INVITE, 100 Trying, 180 Ringing, 200 OK, ACK), two messages tear it down (BYE, 200 OK), and a handful of failure responses cover almost every real-world problem. Inside the bodies, SDP offer/answer handles the media negotiation, and the audio itself flows over RTP on a separate channel that SIP never touches.

Knowing what each message carries, in what order, and what changes between request and response is the difference between guessing at a SIP trace and reading it. For engineers integrating SIP trunks, troubleshooting mid-call failures, or designing multi-vendor interconnects, this fluency is the foundation everything else builds on.

How ProSBC Handles Every Message in the Flow

ProSBC is a true B2BUA, which means every SIP message you have just read about passes through a programmable engine before it leaves the SBC. INVITEs can be rewritten by header manipulation rules to fix vendor mismatches; SDP bodies can be modified to anchor media, force a single codec, or convert between RTP and SRTP; failure responses can be remapped on the fly to normalize behavior across upstream providers.

The same programmable layer drives STIR/SHAKEN signing and verification on the INVITE, applies dynamic blacklisting before the call is accepted, and exposes an SBC API for integration with billing, fraud, and CRM systems. For carrier interconnects, Microsoft Teams Direct Routing, and multi-vendor enterprise voice, this control over every step of the call flow is what makes B2BUA architecture worth choosing over a SIP proxy.
