by Emmanuel Proulx
The Session Initiation Protocol (SIP) is a signaling protocol with great significance to the telecommunications industry. This article provides a general and technical introduction to SIP, and shows how SIP is an important enabler of telecommunication solutions.
I once had a great idea for a piece of software that would "float" on top of an application, providing assistance. No, this would not be a dumb "help" system. It would be a live technical support agent, conferenced in over the Internet. At the time I was told "there are no tools, libraries, protocols, or bandwidth for doing that!"
Times have changed.
Many people have broadband at home, over DSL, cable, and other technologies. Good quality tools and libraries abound, both commercial and open source. Standards enable applications. Now is the time for the cool ideas to be executed.
Let me introduce you to SIP, the Session Initiation Protocol. SIP is a lightweight, extensible, request/response protocol for starting communication sessions between two end-points. Does this sound familiar? SIP was inspired conceptually by HTTP and SMTP, although its intent is different. You can compare SIP messages to CB radio lingo such as 10-codes and Q-signals.
Figure 1. Lingo used to manage a CB call
In this example, the real message is surrounded with special call negotiation messages.
SIP was first published by the IETF in 1999 and revised in 2002. The current version is described in RFC 3261. Much of the information about SIP in this article was distilled from that RFC. Many extensions to SIP exist, and many of these extensions can be found in this list of SIP-related RFCs and drafts.
What are the benefits of SIP? Generally it is used by two end-points to negotiate a "call." By negotiate I mean agreeing on the medium (text, voice, other), the transport (usually RTP, the Real-time Transport Protocol), and the encoding (codec). Once the negotiation is successful, the two end-points use the selected method for talking to each other, independently of SIP. Once the "call" is over, SIP is used to indicate a disconnection. Therefore, SIP is best used as a signaling mechanism. SIP and its extensions also provide related functions such as instant messaging, registration, and presence.
An end-point in the SIP jargon is called a user agent. This could be a "soft phone," an instant messenger, an IP phone, or even a cellular phone. Centralized services are provided by server user agents such as registrars, proxies, or application servers.
SIP sounds very simple, and it is. But while this simplicity is important for the protocol to be stable, it doesn't limit the usefulness of the protocol, which has found a rich set of application areas.
Think of HTTP for example. The protocol definition by itself is tiny. But the ways to use it are unlimited. SIP is also extensible. Dozens of extensions already exist for SIP that cover a wide range of applications. Let's now take a more in-depth look at SIP and why it's important.
It's been said that what HTTP did for the Web, SIP will do for telecommunications.
SIP has colossal repercussions in the telecommunications industry. Cellular-technology companies have decided to standardize on SIP for all future applications. VoIP (Voice over IP) vendors, Internet telephony, and instant messaging applications (for example, Microsoft MSN Messenger) are all standardizing on SIP.
Some signaling protocols as well as peer-to-peer technology already exist. This raises the question: what advantage does SIP have over them? SIP offers some definite advantages:
This means that if you want your applications to interoperate with other tools, equipment, and servers, SIP is the best way to go. Vendors are serious about interoperability and meet on a regular basis to test their products together. These meetings are called SIPit for SIP Interoperability Tests (previously Bakeoff, which was renamed because Pillsbury sued).
Let's now look more closely at the technology. SIP is usually transported over UDP, although TCP must also be supported by SIP tools. A SIP message contains two parts: an envelope (the first line and the header fields) and a payload (the content).
The envelope is text, but the content may be text or binary.
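To make the two-part structure concrete, here is a minimal Java sketch that separates the text envelope from the payload. It assumes the two parts are separated by an empty line (CRLF CRLF) and that the whole message is already in memory. The class name `SipMessageSplitter` is hypothetical and not part of any SIP library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper: splits a raw SIP message into its text envelope and raw payload. */
final class SipMessageSplitter {
    final String envelope; // first line and header fields, always text
    final byte[] payload;  // content, which may be text or binary

    private SipMessageSplitter(String envelope, byte[] payload) {
        this.envelope = envelope;
        this.payload = payload;
    }

    /** Splits at the first empty line (CRLF CRLF); throws if there is none. */
    static SipMessageSplitter split(byte[] raw) {
        for (int i = 0; i + 3 < raw.length; i++) {
            if (raw[i] == '\r' && raw[i + 1] == '\n' && raw[i + 2] == '\r' && raw[i + 3] == '\n') {
                String envelope = new String(raw, 0, i, StandardCharsets.UTF_8);
                byte[] payload = Arrays.copyOfRange(raw, i + 4, raw.length);
                return new SipMessageSplitter(envelope, payload);
            }
        }
        throw new IllegalArgumentException("No empty line between envelope and payload");
    }

    public static void main(String[] args) {
        byte[] raw = ("BYE sip:UAB@example.com SIP/2.0\r\n"
                + "Content-Length: 0\r\n\r\n").getBytes(StandardCharsets.UTF_8);
        SipMessageSplitter message = split(raw);
        System.out.println(message.envelope);
        System.out.println("Payload bytes: " + message.payload.length);
    }
}
```

The payload is kept as bytes rather than a string because, as noted above, it is not guaranteed to be text.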
As an example, let's dissect a typical SIP call. In this scenario, User A wants to call User B. Figure 2 illustrates this call:
Figure 2. A typical SIP call
All these messages are explained here:
1. User Agent A sends a SIP request "INVITE" to User Agent B to indicate User A's wish to talk to User B. This request contains the details of the voice streaming protocol. The Session Description Protocol (SDP) is used in the payload for this purpose. The SDP message contains a list of all media codecs supported by User A. (These codecs are using RTP for transport.)

    INVITE sip:UAB@example.com SIP/2.0
    Via: SIP/2.0/UDP 10.20.30.40:5060
    From: UserA <sip:UAA@example.com>;tag=589304
    To: UserB <sip:UAB@example.com>
    Call-ID: email@example.com
    CSeq: 1 INVITE
    Contact: <sip:UserA@10.20.30.40>
    Content-Type: application/sdp
    Content-Length: 141

    v=0
    o=UserA 2890844526 2890844526 IN IP4 10.20.30.40
    s=Session SDP
    c=IN IP4 10.20.30.40
    t=3034423619 0
    m=audio 49170 RTP/AVP 0
    a=rtpmap:0 PCMU/8000

2. User Agent B reads the request and tells User Agent A it has been received.

    SIP/2.0 100 Trying
    From: UserA <sip:UAA@example.com>;tag=589304
    To: UserB <sip:UAB@example.com>
    Call-ID: email@example.com
    CSeq: 1 INVITE
    Content-Length: 0

3. While the phone rings, User Agent B sends provisional messages (ringing) to User Agent A just so it doesn't time out and give up.

    SIP/2.0 180 Ringing
    From: UserA <sip:UAA@example.com>;tag=589304
    To: UserB <sip:UAB@example.com>;tag=314159
    Call-ID: email@example.com
    CSeq: 1 INVITE
    Content-Length: 0

4. Eventually User B decides to accept the call. At this point User Agent B sends an OK response to User Agent A. In the payload of the response, there's another SDP message. It contains a set of media codecs that are supported by both user agents. At this point both parties are officially in the call. All types of SIP requests are accepted using 200-type responses.

    SIP/2.0 200 OK
    From: UserA <sip:UAA@example.com>;tag=589304
    To: UserB <sip:UAB@example.com>;tag=314159
    Call-ID: email@example.com
    CSeq: 1 INVITE
    Contact: <sip:UserB@10.20.30.41>
    Content-Type: application/sdp
    Content-Length: 140

    v=0
    o=UserB 2890844527 2890844527 IN IP4 10.20.30.41
    s=Session SDP
    c=IN IP4 10.20.30.41
    t=3034423619 0
    m=audio 3456 RTP/AVP 0
    a=rtpmap:0 PCMU/8000

5. User Agent A finally confirms with an ACK message. There are no retries and no response messages for this request type, even if the message is lost. ACK is only used in the case of an INVITE message.

    ACK sip:UAB@example.com SIP/2.0
    Via: SIP/2.0/UDP 10.20.30.40:5060
    Route: <sip:UserB@10.20.30.41>
    From: UserA <sip:UAA@example.com>;tag=589304
    To: UserB <sip:UAB@example.com>;tag=314159
    Call-ID: email@example.com
    CSeq: 1 ACK
    Content-Length: 0

6. Both user agents are now connected using the method selected in the last SDP message.

    RTP packets of audio data going in both directions over ports 49170 and 3456 using PCMU/8000 encoding.

7. At the end of the communication session, one of the users hangs up. At this point this user's user agent sends a new request, BYE. This message can be sent by either party. Because it is a new request in the same dialog, it carries the next CSeq number.

    BYE sip:UAB@example.com SIP/2.0
    Via: SIP/2.0/UDP 10.20.30.40:5060
    To: UserB <sip:UAB@example.com>;tag=314159
    From: UserA <sip:UAA@example.com>;tag=589304
    Call-ID: email@example.com
    CSeq: 2 BYE
    Content-Length: 0

8. The other user's user agent accepts the request and replies with an OK message. The call is disconnected.

    SIP/2.0 200 OK
    To: UserB <sip:UAB@example.com>;tag=314159
    From: UserA <sip:UAA@example.com>;tag=589304
    Call-ID: email@example.com
    CSeq: 2 BYE
    Content-Length: 0
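Step 4 says the answering SDP message contains the codecs supported by both user agents. Here is a minimal Java sketch of that selection, assuming only the first `m=audio` line matters and that codecs are identified by RTP payload type numbers (0 is PCMU, as in the example; 8, PCMA, appears in the demo only for illustration). The class `CodecNegotiation` and its methods are hypothetical names, and a real SDP implementation handles much more than this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: picks the codecs common to an SDP offer and a local list. */
final class CodecNegotiation {

    /** Returns the payload type numbers listed on the first m=audio line, or an empty list. */
    static List<Integer> offeredPayloadTypes(String sdp) {
        for (String line : sdp.split("\r?\n")) {
            if (line.startsWith("m=audio ")) {
                // Layout: m=<media> <port> <transport> <payload type> [<payload type> ...]
                String[] parts = line.trim().split("\\s+");
                List<Integer> types = new ArrayList<>();
                for (int i = 3; i < parts.length; i++) {
                    types.add(Integer.parseInt(parts[i]));
                }
                return types;
            }
        }
        return new ArrayList<>();
    }

    /** Keeps the offered payload types that are also supported locally, in offer order. */
    static List<Integer> common(List<Integer> offered, List<Integer> supported) {
        List<Integer> result = new ArrayList<>(offered);
        result.retainAll(supported);
        return result;
    }

    public static void main(String[] args) {
        String offer = "v=0\r\nm=audio 49170 RTP/AVP 0 8\r\na=rtpmap:0 PCMU/8000\r\n";
        List<Integer> localCodecs = Arrays.asList(0); // this side only speaks PCMU
        System.out.println(common(offeredPayloadTypes(offer), localCodecs));
    }
}
```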
The first line of a SIP message contains the type of message and the version of SIP used (2.0). In requests, this line also contains an address called the SIP URI. This represents the destination of the message.
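Here is a small Java sketch of how that first line could be read. It assumes requests look like `METHOD URI VERSION` and responses look like `VERSION CODE REASON`, as in the example messages. `SipStartLine` is a hypothetical class written for this article, not a library type.

```java
/** Hypothetical value class for the first line of a SIP message. */
final class SipStartLine {
    final boolean request;  // true for requests, false for responses
    final String method;    // for example INVITE; null for responses
    final String uri;       // SIP URI of the destination; null for responses
    final int statusCode;   // for example 200; -1 for requests
    final String reason;    // for example OK; null for requests
    final String version;   // for example SIP/2.0

    private SipStartLine(boolean request, String method, String uri,
                         int statusCode, String reason, String version) {
        this.request = request;
        this.method = method;
        this.uri = uri;
        this.statusCode = statusCode;
        this.reason = reason;
        this.version = version;
    }

    static SipStartLine parse(String line) {
        String[] parts = line.trim().split(" ", 3);
        if (parts.length < 3) {
            throw new IllegalArgumentException("Malformed first line: " + line);
        }
        if (parts[0].startsWith("SIP/")) {
            // Response: version, status code, reason phrase (which may contain spaces)
            return new SipStartLine(false, null, null,
                    Integer.parseInt(parts[1]), parts[2], parts[0]);
        }
        // Request: method, SIP URI, version
        return new SipStartLine(true, parts[0], parts[1], -1, null, parts[2]);
    }

    public static void main(String[] args) {
        SipStartLine invite = parse("INVITE sip:UAB@example.com SIP/2.0");
        SipStartLine ringing = parse("SIP/2.0 180 Ringing");
        System.out.println(invite.method + " to " + invite.uri);
        System.out.println(ringing.statusCode + " " + ringing.reason);
    }
}
```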
This example illustrates the use of request messages INVITE, ACK, and BYE, as well as the 200 OK response message. Many other messages exist in SIP. Here are a few requests:
| Request | Purpose |
|---|---|
| INVITE | Call a user agent, transfer a call. |
| ACK | Confirm the call. |
| BYE | End a call. |
| CANCEL | End a call that hasn't been OK'd yet. |
| REGISTER | Provide a registrar service with a contact address and the alias that can be used instead. For example, the address sip:UAA@example.com is an alias for sip:UserA@10.20.30.40 in the previous example. The registrar server example.com can then forward calls for UAA to the address 10.20.30.40. |
| OPTIONS | Ask a user agent for its "capabilities" (for example, messages and codecs it understands). |
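The alias lookup that REGISTER enables can be pictured as a simple map. The sketch below is only an illustration under that assumption: `InMemoryRegistrar` is a hypothetical class, and it ignores everything a real registrar has to deal with, such as authentication, expiry, and several contacts per alias.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory registrar: maps an alias to a contact address. */
final class InMemoryRegistrar {
    private final Map<String, String> bindings = new ConcurrentHashMap<>();

    /** Records what a REGISTER request announces: this alias is reachable at this contact. */
    void register(String alias, String contact) {
        bindings.put(alias, contact);
    }

    /** Returns the contact for an alias, or null if nobody registered it. */
    String lookup(String alias) {
        return bindings.get(alias);
    }

    public static void main(String[] args) {
        InMemoryRegistrar registrar = new InMemoryRegistrar();
        registrar.register("sip:UAA@example.com", "sip:UserA@10.20.30.40");
        System.out.println(registrar.lookup("sip:UAA@example.com"));
        System.out.println(registrar.lookup("sip:nobody@example.com"));
    }
}
```

A null result corresponds to the situation where the user doesn't exist or isn't registered.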
Now here are some often-used response messages:
| Response | Meaning |
|---|---|
| 100 Trying | The message has been received but not processed by the end user agent yet. Please wait. |
| 180 Ringing | The message has been received by the end user agent, which is prompting the user. Please wait. |
| 200 OK | The message was accepted by the end user. |
| 301 Moved Permanently & 302 Moved Temporarily | The address of the user agent has changed; here's the new permanent or temporary address, in the Contact field. |
| 400 Bad Request | A generic error message. The receiver couldn't understand the request. |
| 401 Unauthorized & 407 Proxy Authentication Required | Please try again with credentials. |
| 404 Not Found | The user you're trying to reach doesn't exist or isn't registered. |
| 408 Request Timeout | The other party isn't responding. This means a SIP message was never OK'd. All retries were dropped as well. It doesn't mean that the phone rang for too long (phones can ring forever). |
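As in HTTP, the first digit of the status code gives the class of the response. Here is a short Java sketch of that grouping. The class boundaries follow RFC 3261, while the enum `SipResponseClass` and its constant names are my own.

```java
/** Hypothetical enum: the six classes of SIP responses, by first digit of the status code. */
enum SipResponseClass {
    PROVISIONAL,     // 1xx, for example 100 Trying and 180 Ringing
    SUCCESS,         // 2xx, for example 200 OK
    REDIRECTION,     // 3xx, for example 302 Moved Temporarily
    CLIENT_ERROR,    // 4xx, for example 404 Not Found
    SERVER_ERROR,    // 5xx
    GLOBAL_FAILURE;  // 6xx

    static SipResponseClass of(int statusCode) {
        if (statusCode < 100 || statusCode > 699) {
            throw new IllegalArgumentException("Not a SIP status code: " + statusCode);
        }
        // The enum constants are declared in order, so 1xx maps to index 0, 2xx to 1, and so on.
        return values()[statusCode / 100 - 1];
    }

    /** Provisional responses mean "please wait"; everything else ends the request. */
    boolean isFinal() {
        return this != PROVISIONAL;
    }

    public static void main(String[] args) {
        System.out.println(of(180) + " final=" + of(180).isFinal());
        System.out.println(of(200) + " final=" + of(200).isFinal());
    }
}
```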
Messages use similar types of header fields. Here are a few of them:
| Header | Description |
|---|---|
| From | Sender of the SIP request. |
| To | Receiver of the SIP request. This is often the same as the SIP URI (can be an "alias" or a real address). |
| Contact | Real address of the user agent. |
| Call-ID | No, this isn't the phone number of the caller. It uniquely represents the whole call, or dialog, between the two user agents. All related SIP messages use the same Call-ID. For example, when a user agent receives a BYE message, it knows which call to hang up based on the Call-ID. |
| CSeq | Sequence number of a message. This is unique inside a single dialog or Call-ID. This is used to differentiate between new messages and "retries." Retries happen when an initial message isn't OK'd in time, and are sent at regular intervals. |
| Content-Type | The MIME type for the payload inside the message. |
| Content-Length | Size in bytes of the payload. The envelope and the payload are separated by an empty line. |
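To illustrate how Call-ID and CSeq together tell a new message from a retry, here is a deliberately simplified Java sketch. It assumes the pair of Call-ID and full CSeq value (number plus method) is enough to recognize a message seen before. `RetryDetector` is a hypothetical class, and real SIP stacks match messages more carefully than this.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: remembers which Call-ID and CSeq pairs were already received. */
final class RetryDetector {
    private final Set<String> seen = new HashSet<>();

    /**
     * Returns true if a message with this Call-ID and CSeq value was already seen,
     * which this sketch treats as a retry. The CSeq value includes the method
     * (for example "1 INVITE"), so "1 ACK" is not mistaken for a retried INVITE.
     */
    boolean isRetry(String callId, String cseq) {
        // Set.add returns false when the element was already present.
        return !seen.add(callId + "|" + cseq);
    }

    public static void main(String[] args) {
        RetryDetector detector = new RetryDetector();
        System.out.println(detector.isRetry("email@example.com", "1 INVITE"));
        System.out.println(detector.isRetry("email@example.com", "1 INVITE"));
        System.out.println(detector.isRetry("email@example.com", "1 ACK"));
    }
}
```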
Additional headers exist for message-routing-related functions, like Via, Route, and Record-Route. Some headers describe capabilities, such as Accept, User-Agent, and Supported. Others deal with security, such as Authorization, Privacy, and WWW-Authenticate. Many more headers exist. Also, many of these fields have a short syntax (for example, From = f, To = t, and so on).
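Here is a minimal Java sketch of reading header fields from an envelope while expanding the short syntax. The short forms listed are the ones I'm sure of from RFC 3261. The class `SipHeaders` is hypothetical, and the sketch makes simplifying assumptions: one header per line, no folded lines, and a repeated header (several Via lines, for instance) simply overwrites the earlier value.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: parses header fields and expands their short names. */
final class SipHeaders {
    private static final Map<String, String> SHORT_NAMES = new HashMap<>();
    static {
        SHORT_NAMES.put("f", "From");
        SHORT_NAMES.put("t", "To");
        SHORT_NAMES.put("i", "Call-ID");
        SHORT_NAMES.put("m", "Contact");
        SHORT_NAMES.put("v", "Via");
        SHORT_NAMES.put("c", "Content-Type");
        SHORT_NAMES.put("l", "Content-Length");
    }

    /** Parses the envelope (first line plus headers) into a name-to-value map. */
    static Map<String, String> parse(String envelope) {
        Map<String, String> headers = new LinkedHashMap<>();
        String[] lines = envelope.split("\r?\n");
        for (int i = 1; i < lines.length; i++) { // index 0 is the first line, not a header
            int colon = lines[i].indexOf(':');
            if (colon <= 0) {
                continue; // not a "Name: value" line
            }
            String name = lines[i].substring(0, colon).trim();
            String value = lines[i].substring(colon + 1).trim();
            String fullName = SHORT_NAMES.get(name.toLowerCase(Locale.ROOT));
            headers.put(fullName != null ? fullName : name, value);
        }
        return headers;
    }

    public static void main(String[] args) {
        String envelope = "BYE sip:UAB@example.com SIP/2.0\r\n"
                + "f: UserA <sip:UAA@example.com>;tag=589304\r\n"
                + "t: UserB <sip:UAB@example.com>;tag=314159\r\n"
                + "l: 0";
        System.out.println(parse(envelope));
    }
}
```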
There are many applications that can be implemented with SIP and its extensions:
Basically, if it makes two end-points communicate, SIP can do it.
But what about my super-duper idea of a live over-the-Web technical support agent? Can we do that today with SIP? And can we do that in Java, my favorite language? In short, yes.
I work with SIP a lot. I can safely say that Java offers excellent support for SIP. An assortment of Java technologies, helpful to SIP developers, abstracts away many details associated with developing SIP applications. These are mostly in the JAIN (Java APIs for Integrated Networks) work group:
Other related technologies are:
If you wish to develop a client application, you need a client-side SIP engine, or "stack." A good, open source, Java SIP stack is available here. It also supports SDP. If you don't want to develop your own SIP phone, you can use this one.
This article provided a brief introduction to SIP, the scenarios in which you can use it, and a bit of the SIP syntax. We also glanced at the various Java technologies related to SIP. Although the article isn't detailed, I hope it's enough to kindle your interest and encourage you to start using SIP. SIP's time has arrived and now plenty of cool ideas can finally be realized.
In Part 2 of this article series, I will show you how to write a chat room application using the SIP Servlet API.
Emmanuel Proulx is an expert in J2EE and SIP. He is a certified WebLogic Server engineer.