What Is a SIP Proxy? Architecture, B2BUA vs. Proxy, and Enterprise SIP Interoperability

You’ve got a Cisco PBX in the data center, an Avaya contact center on a separate subnet, and a cloud SIP trunk provider that speaks a slightly different dialect of SIP than either of them. The search for a “SIP proxy” that will make these systems talk to each other reliably is the starting point for a lot of SIP deployments, and it’s often where engineers discover that a lightweight SIP proxy isn’t the right tool for the job.

A SIP proxy, in the strict RFC 3261 sense, is a server that forwards SIP requests toward their destination without fully terminating the call session. It sits in the signaling path, decides where to route requests, and passes them along; it doesn’t renegotiate how the call is set up, rewrite headers for vendor compatibility, or control the media path. In a homogeneous, single-vendor environment, that transparency is exactly what you want. In the multi-vendor, multi-carrier environments that characterize most enterprise and service provider deployments, you need something that can actually intervene.

This page covers the architecture of SIP proxies, the critical differences between a proxy and a Back-to-Back User Agent (B2BUA), where each belongs in a real network, and why the majority of carrier interconnects and enterprise SIP deployments need the full capabilities of a Session Border Controller (SBC).

How a SIP Proxy Works

A SIP proxy is a network intermediary that routes SIP signaling messages between User Agents (UAs, including phones, softclients, PBX systems, and any SIP-speaking endpoint). Its primary job is to resolve a SIP URI to a network address and forward the request one hop closer to the destination.

The proxy adds a Via header to each SIP request as it passes through, recording its address in the signaling path. Responses from the callee travel back through the same chain of Via headers, each proxy stripping its own entry as the response passes. This creates a traceable, reversible signaling route without requiring the proxy to maintain any state about the call itself.
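The Via handling described above can be sketched in a few lines of Python. This is an illustrative model only, not a SIP stack: the host names, the `add_via` and `strip_via` helpers, and the list-of-strings representation are all assumptions made for the example.

```python
import uuid

MAGIC_COOKIE = "z9hG4bK"  # RFC 3261 requires branch values to start with this

def add_via(via_stack, host, transport="UDP"):
    """Return a new Via list with this proxy's entry placed on top."""
    branch = MAGIC_COOKIE + uuid.uuid4().hex
    entry = f"SIP/2.0/{transport} {host};branch={branch}"
    return [entry] + via_stack

def strip_via(via_stack, host):
    """Remove this proxy's own top Via from a response and return the rest."""
    top = via_stack[0]
    sent_by = top.split()[1].split(";")[0]
    if sent_by != host:
        raise ValueError(f"top Via belongs to {sent_by}, not {host}")
    return via_stack[1:]

# Request path: caller -> proxy1 -> proxy2 (hypothetical hosts)
request_vias = ["SIP/2.0/UDP alice-pc.example.com;branch=z9hG4bKa1"]
request_vias = add_via(request_vias, "proxy1.example.com")
request_vias = add_via(request_vias, "proxy2.example.com")

# The response carries the same Via list back; each proxy pops its own entry.
response_vias = strip_via(request_vias, "proxy2.example.com")
response_vias = strip_via(response_vias, "proxy1.example.com")
```

After both proxies have removed their entries, only the caller's original Via remains, which is where the response is finally delivered.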

Stateless vs. Stateful Proxies

A stateless proxy processes each SIP message in isolation. It reads the request URI, applies routing logic, forwards the message, and forgets it. There is no transaction state, no call record, no correlation between a request and its response. Stateless proxies are fast and horizontally scalable, but they have no ability to do anything that requires awareness of the call as a whole: no forking, no authentication, no fallback routing.

A stateful proxy tracks SIP transactions (the full exchange of a request and its responses) and optionally tracks the dialog (the complete call from INVITE to BYE). This allows stateful proxies to support features such as parallel forking (sending an INVITE to multiple destinations simultaneously), authenticated re-routing, and rudimentary load distribution. Most production SIP infrastructure uses stateful proxies when proxies are used at all.
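To make the role of transaction state concrete, here is a minimal sketch of the bookkeeping parallel forking needs. The `ForkedInvite` class and its fields are hypothetical, and the failure handling (forward the lowest-numbered failure once every branch has answered) is a simplification of the best-response selection in RFC 3261, not a faithful implementation of it.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ForkedInvite:
    """State a stateful proxy might keep for one forked INVITE (simplified)."""
    targets: list
    pending: set = field(init=False)
    failures: list = field(default_factory=list)
    winner: Optional[str] = None
    to_cancel: list = field(default_factory=list)

    def __post_init__(self):
        self.pending = set(self.targets)

    def on_final_response(self, target, status):
        """Record a final response; return the status to forward upstream, if any."""
        self.pending.discard(target)
        if self.winner is not None:
            return None  # a branch already answered; nothing more to forward
        if 200 <= status < 300:
            self.winner = target
            self.to_cancel = sorted(self.pending)  # branches to send CANCEL to
            self.pending.clear()
            return status
        self.failures.append(status)
        if not self.pending:
            return min(self.failures)  # simplified "best" failure response
        return None

tx = ForkedInvite(["sip:desk@10.0.0.5", "sip:mobile@10.0.0.9", "sip:soft@10.0.0.12"])
first = tx.on_final_response("sip:desk@10.0.0.5", 486)    # busy, keep waiting
second = tx.on_final_response("sip:mobile@10.0.0.9", 200)  # answered
```

A stateless proxy has nowhere to keep `pending` or `winner`, which is why it cannot fork: it could not tell which responses belong together or which branches to cancel.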

What neither stateless nor stateful proxies do: modify the SDP body to change codec preferences, rewrite SIP headers to accommodate vendor incompatibilities, terminate the media path to apply encryption, or hide the internal network topology from external parties.

SIP proxy call flow: signaling routes through the proxy via Via headers while RTP media flows directly between endpoints. The proxy does not touch the media path.

The Forwarding Model: Via, Route, and Record-Route

Three SIP headers define how proxies participate in a session:

Via is the tracing mechanism. Every proxy adds a Via entry with its own address (and a unique branch parameter) on top of the existing ones as the request passes through, and each proxy removes its own entry as the response travels back in reverse order. This is how a response knows where to go.

Route is a pre-loaded routing instruction: a list of proxy addresses the request must pass through, consumed one by one.

Record-Route is how a stateful proxy inserts itself into the ongoing dialog. By adding itself to the Record-Route header, the proxy ensures it will see all subsequent requests in the same dialog, keeping itself in the signaling path for the life of the call. A proxy that does not add Record-Route will only see the initial INVITE, not the mid-call re-INVITEs or the BYE.

None of these mechanisms give the proxy the ability to change what the request contains. They govern routing, not content.
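As a rough illustration of how Record-Route keeps a proxy in the path, the sketch below derives each endpoint's route set from the Record-Route list, following the ordering rules in RFC 3261 section 12.1. The proxy host names and the two helper functions are assumptions for the example.

```python
def uac_route_set(record_route_in_response):
    """Caller side: Record-Route entries from the 2xx, in reverse order."""
    return list(reversed(record_route_in_response))

def uas_route_set(record_route_in_request):
    """Callee side: Record-Route entries from the INVITE, in order."""
    return list(record_route_in_request)

# Record-Route as the callee receives it: the last proxy to handle the
# INVITE is at the top. A proxy that did not record-route is simply absent.
record_route = ["<sip:p2.example.net;lr>", "<sip:p1.example.com;lr>"]

caller_routes = uac_route_set(record_route)  # path for the caller's BYE
callee_routes = uas_route_set(record_route)  # path for the callee's BYE
```

Each side ends up with the proxies listed in the order its own mid-dialog requests will traverse them, so a BYE from either party passes through both recording proxies.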

B2BUA vs. SIP Proxy: Key Architectural Differences

A Back-to-Back User Agent takes a fundamentally different approach. Rather than forwarding requests, a B2BUA terminates the SIP dialog from the caller, acting as a User Agent Server (UAS), and originates a completely new SIP dialog toward the callee, acting as a User Agent Client (UAC). The two legs are fully independent. The B2BUA can change anything: the SIP headers, the SDP offer, the codec list, the transport protocol, the media path.

This architectural difference has far-reaching consequences for what the device can and cannot do.

Capability	SIP Proxy	B2BUA
Topology hiding (IP address concealment)	No	Yes
SIP header rewriting for vendor normalization	No	Yes
Codec renegotiation	No	Yes
Media anchoring and encryption (SRTP)	No	Yes
DTMF format conversion	No	Yes
DoS/DDoS protection at the signaling layer	Limited	Yes
Access control lists (IP/number-based blocking)	Limited	Yes
Call routing based on call content	Limited	Yes
Full session accounting and CDR generation	Limited	Yes

What a B2BUA Can Do That a Proxy Cannot

Because a B2BUA terminates and re-originates, it has complete visibility and control over both signaling legs. This enables:

SIP header manipulation. A B2BUA can add, modify, or remove any SIP header field. In practice, this means normalizing incompatible vendor implementations: converting a Cisco-flavored SIP header format to one an Avaya system accepts, or stripping proprietary headers that a carrier SIP trunk rejects.

Topology hiding. In a SIP proxy, the caller’s SIP headers reflect the real internal network addresses. A B2BUA replaces all internal addressing with its own external address, concealing the internal topology entirely. This is a security requirement for any carrier-facing or internet-facing deployment.
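A simplified sketch of what re-origination means for addressing: the outbound leg is built from scratch using only the user parts of the inbound headers, so no internal host or IP is copied across. The header dictionaries, host names, and the `originate_leg_b` helper are hypothetical and omit most of a real INVITE.

```python
import re
import uuid

def user_part(header_value):
    """Extract the user part of the first SIP URI in a header value."""
    match = re.search(r"sips?:([^@>;]+)@", header_value)
    if match is None:
        raise ValueError(f"no user part in {header_value!r}")
    return match.group(1)

def originate_leg_b(leg_a, sbc_host, carrier_host):
    """Build new dialog headers for the outbound leg; nothing is forwarded as-is."""
    caller = user_part(leg_a["From"])
    callee = user_part(leg_a["To"])
    return {
        "Request-URI": f"sip:{callee}@{carrier_host}",
        "Via": [f"SIP/2.0/UDP {sbc_host};branch=z9hG4bK{uuid.uuid4().hex}"],
        "From": f"<sip:{caller}@{sbc_host}>;tag={uuid.uuid4().hex[:8]}",
        "To": f"<sip:{callee}@{carrier_host}>",
        "Call-ID": f"{uuid.uuid4().hex}@{sbc_host}",
        "Contact": f"<sip:{caller}@{sbc_host}>",
    }

leg_a = {
    "Via": ["SIP/2.0/UDP 10.1.1.20:5060;branch=z9hG4bK74bf9"],
    "From": "<sip:1001@10.1.1.20>;tag=9fxced76sl",
    "To": "<sip:+15551230000@pbx.internal.example>",
    "Call-ID": "3848276298220188511@10.1.1.20",
    "Contact": "<sip:1001@10.1.1.20:5060>",
}
leg_b = originate_leg_b(leg_a, "sbc.example.com", "trunk.carrier.example")
```

The carrier-facing leg has its own Call-ID, tags, Via, and Contact, all pointing at the B2BUA's external name, while the inbound dialog keeps its original identifiers.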

Codec renegotiation. When two endpoints offer incompatible codec lists in their SDP, a B2BUA can modify the SDP offer/answer exchange to broker a codec both sides accept, or invoke transcoding if no common codec exists.
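The brokering decision can be sketched as a set intersection with a transcoding fallback. This is a toy model under stated assumptions: codecs are represented as plain names rather than parsed from SDP, and the `broker` function and its return format are invented for illustration.

```python
def common_codecs(offered, supported):
    """Codecs both sides support, kept in the offerer's preference order."""
    supported_set = {codec.upper() for codec in supported}
    return [codec for codec in offered if codec.upper() in supported_set]

def broker(offer_a, capabilities_b):
    """Pick one shared codec, or plan transcoding if the lists do not overlap."""
    shared = common_codecs(offer_a, capabilities_b)
    if shared:
        return {"action": "passthrough", "codec_a": shared[0], "codec_b": shared[0]}
    # No overlap: each leg keeps its own first choice and media is transcoded.
    return {"action": "transcode", "codec_a": offer_a[0], "codec_b": capabilities_b[0]}

overlap = broker(["G722", "PCMU"], ["PCMA", "PCMU"])
no_overlap = broker(["OPUS"], ["G729", "PCMA"])
```

A proxy cannot make this decision on the endpoints' behalf because it forwards the SDP untouched and never sees the media.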

Media anchoring. A B2BUA can place itself in the media path, accepting and forwarding RTP packets, which enables SRTP encryption/decryption, RTP-to-SRTP conversion, media recording, and MOS quality scoring.

The Trade-Off: Complexity vs. Control

A SIP proxy is simpler to deploy and adds less processing overhead per call, which makes it appropriate for high-volume internal routing in a homogeneous environment. A B2BUA costs more per call because it holds state for two independent dialogs (and, when it anchors media, relays every RTP packet), and it requires more configuration. In return, it gives you the control surface needed to make heterogeneous SIP networks work.

For anything facing the internet or a carrier, the B2BUA model is not optional. The proxy model simply cannot satisfy the security and interoperability requirements.

SIP Normalization: The Problem a Proxy Cannot Solve

The SIP standard (RFC 3261 and its companions) defines the protocol. It does not mandate that every vendor implement every feature the same way. In practice, two systems that both claim SIP compliance will frequently disagree on header formatting, SDP codec ordering, DTMF signaling method, UPDATE vs. re-INVITE behavior, session timer handling, and dozens of other details.

Common Multi-Vendor SIP Incompatibilities

The following are real interoperability problems encountered in production voice networks:

DTMF signaling. Some endpoints send DTMF digits as in-band audio tones inside the voice codec stream. Others send them as named telephone-event packets in the RTP stream (RFC 4733, which obsoletes RFC 2833), a payload type that must be negotiated in the SDP. Still others send digits out-of-band in SIP INFO messages. When two systems use different methods and nothing converts between them, DTMF digits can be lost entirely, a problem that destroys IVR functionality, voicemail access, and conferencing.
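One way a mismatch can be spotted is by checking whether each leg's SDP advertises the telephone-event payload. The sketch below does only that; the SDP fragments are trimmed to the relevant lines, and the `dtmf_plan` function and its result strings are hypothetical.

```python
import re

TELEPHONE_EVENT = re.compile(
    r"^a=rtpmap:\d+ telephone-event/\d+", re.MULTILINE | re.IGNORECASE
)

def offers_telephone_event(sdp):
    """True if the SDP advertises the telephone-event RTP payload."""
    return TELEPHONE_EVENT.search(sdp) is not None

def dtmf_plan(sdp_a, sdp_b):
    """Decide whether DTMF can be relayed as-is or needs conversion."""
    a, b = offers_telephone_event(sdp_a), offers_telephone_event(sdp_b)
    if a and b:
        return "relay telephone-event"
    if a or b:
        return "convert between legs"
    return "no telephone-event on either leg"

# Trimmed SDP fragments, not complete session descriptions
sdp_pbx = (
    "v=0\r\n"
    "m=audio 49170 RTP/AVP 0 101\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
    "a=rtpmap:101 telephone-event/8000\r\n"
    "a=fmtp:101 0-15\r\n"
)
sdp_trunk = (
    "v=0\r\n"
    "m=audio 40000 RTP/AVP 0\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
)

plan = dtmf_plan(sdp_pbx, sdp_trunk)
```

Detecting the mismatch is the easy part; the conversion itself requires a device in the media path, which is why this falls to a B2BUA rather than a proxy.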

Codec ordering. An SDP offer lists codecs in priority order. Some vendors require the first codec in the list to be the one they prefer; others choose the highest-quality common codec regardless of position. When a codec order mismatch results in a suboptimal codec being selected, call quality degrades in ways that are hard to diagnose.
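As a minimal illustration of codec-order normalization, the function below moves preferred payload types to the front of the `m=audio` line and leaves everything else in the SDP alone. The function name and the sample SDP fragment are assumptions; a production implementation would also need to handle multiple media sections and dynamic payload mappings.

```python
def reorder_audio_payloads(sdp, preferred_first):
    """Move the preferred payload types to the front of the m=audio line."""
    out = []
    for line in sdp.splitlines():
        if line.startswith("m=audio "):
            fields = line.split()
            head, payloads = fields[:3], fields[3:]  # "m=audio", port, proto
            front = [pt for pt in preferred_first if pt in payloads]
            rest = [pt for pt in payloads if pt not in front]
            line = " ".join(head + front + rest)
        out.append(line)
    return "\r\n".join(out) + "\r\n"

offer = (
    "v=0\r\n"
    "m=audio 49170 RTP/AVP 0 8 9 101\r\n"
    "a=rtpmap:9 G722/8000\r\n"
)
# Put G.722 (static payload type 9) first for a peer that always takes the first codec.
normalized = reorder_audio_payloads(offer, ["9"])
```

Payload types that are named as preferred but absent from the offer are ignored, so the rule never advertises a codec the sender did not offer.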

Via and Contact header formatting. Certain PBX vendors place extra parameters in Via or Contact headers that downstream SIP trunks reject as malformed. Other vendors strip parameters that are required. Either way, calls fail, or fail intermittently, depending on the call path.

Session timers. RFC 4028 defines session timers to detect hung calls. Not all endpoints support them, and those that do may propose Session-Expires values the other side rejects as too short or too long. A B2BUA can normalize timer behavior across both legs.
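A sketch of one possible normalization: clamp the requested Session-Expires into the range the other leg accepts. The function and its parameters are hypothetical; the defaults of 90 and 1800 seconds follow the Min-SE default and the recommended session interval in RFC 4028, and the per-leg limits would be configuration in practice.

```python
def normalize_session_expires(requested, leg_min_se=90, leg_max=1800):
    """Clamp a Session-Expires value (seconds) to what the other leg accepts.

    None means the sender did not request a session timer at all.
    """
    if requested is None:
        return None
    return max(leg_min_se, min(requested, leg_max))

too_short = normalize_session_expires(60)                    # raised to the minimum
too_long = normalize_session_expires(3600)                   # capped at the maximum
strict_peer = normalize_session_expires(600, leg_min_se=900) # peer demands >= 900
```

Because the two legs are independent dialogs, the value presented on one side does not have to match the value negotiated on the other.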

P-Asserted-Identity (PAI) headers. Caller identity headers vary significantly across vendors and carriers. A B2BUA can normalize these for STIR/SHAKEN compatibility, ensuring the correct identity information is presented at the attestation layer.

SIP Header Manipulation Engine

The solution to the above is a SIP header manipulation engine: a rules-based system that defines, for each call leg or trunk group, exactly what headers are added, modified, or removed as calls pass through. The manipulation rules operate on inbound and outbound SIP messages independently, allowing different vendor-specific treatments on each side of the B2BUA.
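To show the shape of such a rules-based system, here is a small sketch in which each trunk group has an ordered list of add, set, and remove rules. Everything in it is hypothetical: the rule format, the trunk group names, and the `X-` headers are invented for illustration and do not represent any particular SBC's configuration syntax. Header names are matched exactly here, whereas real SIP header names are case-insensitive.

```python
def apply_rules(headers, rules):
    """Return a new header dict with each (action, name, value) rule applied in order."""
    result = dict(headers)
    for action, name, value in rules:
        if action == "remove":
            result.pop(name, None)
        elif action == "set":
            result[name] = value
        elif action == "add_if_missing":
            result.setdefault(name, value)
        else:
            raise ValueError(f"unknown action: {action}")
    return result

# One rule list per trunk group and direction (hypothetical names)
TRUNK_GROUP_RULES = {
    "carrier_out": [
        ("remove", "X-Vendor-Debug", None),
        ("set", "User-Agent", "edge-sbc"),
        ("add_if_missing", "Max-Forwards", "70"),
    ],
    "pbx_in": [
        ("remove", "X-Carrier-Info", None),
    ],
}

from_pbx = {
    "User-Agent": "VendorPBX/12.5",
    "X-Vendor-Debug": "trace=on",
    "Supported": "timer",
}
to_carrier = apply_rules(from_pbx, TRUNK_GROUP_RULES["carrier_out"])
```

Keeping the rules as data rather than code is what lets each side of the B2BUA get its own treatment without touching the other.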

This is the core normalization function that makes a B2BUA SBC the right device for multi-vendor environments. Without it, every vendor incompatibility requires a firewall rule workaround, an endpoint configuration change on the vendor’s system, or a call that simply doesn’t work.

SBC B2BUA normalization: carrier SIP trunk traffic is fully terminated and re-originated, with header rewriting, codec normalization, and topology hiding applied before delivery to enterprise endpoints.

SIP Proxies in Carrier and Enterprise Architectures

In practice, SIP proxies do appear in production networks, but in specific roles where their limitations are acceptable.

Where SIP proxies are used:

As SIP registrar front-ends, distributing REGISTER requests to a pool of registration servers
As internal load balancers between a cluster of identical application servers in a homogeneous platform (all running the same software, same vendor, same codec profile)
As SIP-based DNS resolution layers in large-scale carrier platforms, where the routing decision is pure address translation with no content modification needed

Where SIP proxies fall short:

Any carrier-to-enterprise interconnect, where the carrier SIP trunk speaks one flavor of SIP and the enterprise PBX speaks another
Any deployment requiring TLS for SIP signaling, which requires terminating the TLS session and re-originating it, exactly what a proxy does not do
Any deployment requiring SRTP media encryption, which requires media anchoring
Any deployment subject to DoS/DDoS exposure: a stateless proxy offers no protection; it will forward flood traffic as readily as legitimate traffic

The Session Border Controller is the device class defined specifically to address these requirements at network edges. An SBC uses the B2BUA architecture internally, and layers security, access control, topology management, and normalization on top of it.

For a detailed breakdown of how SBCs connect to carrier trunks, see the SBC SIP Trunk guide. For a deeper dive into security capabilities at the SIP layer, see the VoIP Security pillar.

SIP Registration and Authentication

SIP proxies and SBCs handle SIP registration differently, and the difference matters for both security and operations.

A SIP registrar is the entity that accepts REGISTER requests and maps a SIP AOR (Address of Record, a SIP URI like sip:user@domain.com) to a contact address (the current IP address of the endpoint). In classic deployments, the registrar is a distinct server function, often co-located with a proxy.
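The AOR-to-contact mapping can be pictured as a small table with expiry times. The `Registrar` class below is a deliberately reduced sketch: it keeps a single contact per AOR (real registrars allow several), skips authentication entirely, and takes the clock as a parameter so the behavior is easy to follow.

```python
import time

class Registrar:
    """Toy location table: one contact per Address of Record, with expiry."""

    def __init__(self):
        self._bindings = {}  # AOR -> (contact, absolute expiry time)

    def register(self, aor, contact, expires, now=None):
        now = time.time() if now is None else now
        if expires == 0:
            self._bindings.pop(aor, None)  # Expires: 0 removes the binding
        else:
            self._bindings[aor] = (contact, now + expires)

    def lookup(self, aor, now=None):
        """Return the current contact for an AOR, or None if absent or expired."""
        now = time.time() if now is None else now
        binding = self._bindings.get(aor)
        if binding is None or binding[1] <= now:
            return None
        return binding[0]

reg = Registrar()
reg.register("sip:user@domain.com", "sip:user@192.0.2.44:5060", expires=3600, now=0)
fresh = reg.lookup("sip:user@domain.com", now=1800)
stale = reg.lookup("sip:user@domain.com", now=3601)
```

A proxy co-located with the registrar consults this table to turn the AOR in a request URI into the address it forwards to.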

An SBC with a B2BUA architecture can perform SIP Registration Forwarding: accepting REGISTER requests from endpoints, proxying them upstream to the operator’s registrar, and maintaining the registration mapping locally. This gives the SBC full visibility into the registration state of every endpoint behind it without requiring changes to the upstream registrar.

More importantly, an SBC can perform SIP Registration Scanning Protection: detecting registration flood attacks (automated attempts to brute-force credentials by sending high volumes of REGISTER requests) and blocking them at the edge before they reach the registrar or PBX. A lightweight SIP proxy has no mechanism to distinguish a registration flood from legitimate traffic; it forwards both equally.
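One common way to detect this kind of flood is a sliding-window counter per source address. The sketch below illustrates that general technique only; it is not a description of how any specific SBC implements its protection, and the class name, thresholds, and IP addresses are assumptions.

```python
from collections import defaultdict, deque

class RegisterFloodGuard:
    """Allow at most max_requests REGISTERs per source IP within a time window."""

    def __init__(self, max_requests=10, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # source IP -> timestamps of allowed requests

    def allow(self, source_ip, now):
        recent = self._seen[source_ip]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False  # over the limit: drop at the edge
        recent.append(now)
        return True

guard = RegisterFloodGuard(max_requests=3, window_seconds=10.0)
burst = [guard.allow("203.0.113.7", t) for t in (0.0, 1.0, 2.0, 3.0)]
after_window = guard.allow("203.0.113.7", 12.5)
other_source = guard.allow("198.51.100.20", 3.0)
```

The fourth request in the burst is rejected, the same source is accepted again once the window has passed, and an unrelated source is unaffected throughout.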

These are not edge-case requirements. SIP registration scanning is one of the most common attack vectors against internet-facing SIP infrastructure. Any device sitting at a network edge needs active protection against it.

SIP Proxy in the Microsoft Teams Direct Routing Context

Microsoft Teams Direct Routing is the mechanism that connects Teams Phone users to the public switched telephone network (PSTN) through a customer-managed SIP trunk. Microsoft mandates that the device connecting Teams to the SIP trunk be a certified Session Border Controller (not a SIP proxy).

The reasons are direct. Teams requires:

TLS for SIP signaling on both the Teams-facing and carrier-facing legs. Terminating a TLS session requires the device to act as a TLS endpoint, which a transparent proxy cannot do. A B2BUA SBC terminates TLS on both legs independently.
SRTP for media encryption on the Teams-facing leg. This requires media anchoring, only possible with a B2BUA.
SIP normalization between the Teams SIP dialect and the carrier SIP trunk dialect. Microsoft’s Teams SIP implementation has specific header requirements that most carrier SIP trunks do not natively satisfy.
NAT traversal and topology hiding, ensuring internal network addresses are not leaked into the Teams signaling path.

A SIP proxy satisfies none of these requirements. Microsoft tests SBCs against the full set of Teams SIP requirements: header handling, codec support, DTMF, registration behavior, and TLS/SRTP compliance. For more information on how ProSBC connects to Microsoft Teams Direct Routing, see our product page.

Frequently Asked Questions

What is the difference between a SIP proxy and a Session Border Controller?

A SIP proxy forwards SIP requests without terminating the call session. It cannot modify SIP headers for vendor normalization, enforce media encryption, or hide internal network topology. A Session Border Controller (SBC) uses a B2BUA architecture to fully terminate and re-originate both the signaling and media legs. This gives it complete control over header manipulation, codec negotiation, media encryption (SRTP), topology hiding, DoS/DDoS protection, and access control, all required at any carrier interface or multi-vendor network edge.

Can a SIP proxy handle multi-vendor SIP interoperability?

Lightweight SIP proxies are best suited for homogeneous, single-vendor environments. Multi-vendor interoperability (where a Cisco PBX, an Avaya system, and a carrier SIP trunk all implement SIP headers, DTMF, and codec negotiation differently) requires a B2BUA SBC with a SIP header manipulation engine that can normalize each vendor’s traffic independently.

Does Microsoft Teams Direct Routing require an SBC or a SIP proxy?

Microsoft Teams Direct Routing requires a certified Session Border Controller. A SIP proxy cannot terminate TLS sessions, anchor SRTP media, or perform the SIP normalization between Teams and carrier SIP trunks that Microsoft’s certification program validates. Microsoft maintains a list of certified SBC vendors for Teams Direct Routing.

What is a B2BUA?

A Back-to-Back User Agent (B2BUA) is a SIP entity that acts as both a User Agent Server (UAS) on the incoming leg and a User Agent Client (UAC) on the outgoing leg. It terminates the SIP dialog from the caller and originates a new dialog toward the callee. Unlike a SIP proxy, the B2BUA has full control over both legs, enabling it to rewrite SIP headers, renegotiate codecs, anchor media, and apply encryption independently on each side.

When would you use a SIP proxy instead of an SBC?

A SIP proxy is appropriate for internal routing in a single-vendor environment where all endpoints speak the same SIP dialect and where security boundaries, header normalization, media control, and DoS protection are not required. Any deployment facing the internet, a carrier SIP trunk, or a multi-vendor environment needs the full B2BUA capabilities of an SBC.

Conclusion

SIP proxies are a fundamental component of SIP architecture, but their role is precisely defined. They route requests. They do not terminate sessions, modify content, protect networks, or normalize incompatible implementations. In the environments where those limitations don’t matter (homogeneous internal networks, single-vendor platforms, pure routing tiers), they work well. In the environments that constitute most real voice network deployments, they don’t.

Carrier interconnects, Microsoft Teams Direct Routing, multi-vendor enterprise voice, CPaaS platform integration, and any deployment subject to internet-facing traffic all require the full control surface of a B2BUA Session Border Controller: SIP header manipulation, topology hiding, media anchoring and encryption, DoS/DDoS protection, and the access control to enforce all of it.

Beyond the SIP proxy’s forwarding role, the SBC extends into territory the proxy model cannot reach: the complex, multi-vendor, carrier-facing network edges where normalization, security, and full session control are mandatory.

Ready to Go Beyond the SIP Proxy?

If your network has outgrown the SIP proxy model, or if you’re deploying at a carrier interface, connecting to Microsoft Teams Direct Routing, or working across multiple SIP vendors, the architecture difference is not theoretical. You need a device that can terminate sessions, normalize headers, anchor and encrypt media, and protect the edge.

ProSBC is a carrier-grade, software-based B2BUA Session Border Controller built on over 20 years of SIP deployment experience. It handles up to 60,000 sessions per server and 350,000 endpoint registrations, with a programmable routing layer using Ruby API modules for normalization, STIR/SHAKEN integration, and fraud detection. Subscription pricing starts from $1.25/session/server/year with a 30-day free trial and immediate software download.
