Wednesday, July 15, 2015

A very basic SIP introduction

Having used SIP for a while now, I like to think I know enough to troubleshoot and fix the problems that arise, and to tell when it is a provider issue and when it is a layer 8 issue (the connection between the keyboard and the floor; yes, that was a joke).  I am by no means a SIP master like the Verizon guy on the Cisco forums, but I know enough to tell the difference between a TX and an RX, and why the From and To headers don't change with the direction of the message.  I figured I would post a few things here that I think are either common knowledge or something you need to know if you are getting into SIP.

To start things off, I would very highly recommend that you read RFC 3261.  It is basically a tutorial for SIP and assumes no knowledge of what SIP is and what it can do.  The RFC is freely available online, so you can read it at your leisure.  I know that the RFC is long, but it provides good insight into how the protocol works, defines the terminology, and has a very good description of the headers.

In my opinion, SIP in its basic form is an evolved H.323: the next level, so to speak.  SIP essentially takes all the craziness in H.323 and makes it much more legible and easier to deal with.  If you migrate to a full SIP network, gone are the days of needing to decode the hex of H.323 and read through the TCS (Terminal Capability Set).  The only problem right now is that SIP is not everywhere just yet.  Some ITSPs (Internet Telephony Service Providers) don't yet provide this service or are still new to the game.  I have had the pleasure of working with H.323 over the years, as well as SIP and MGCP.  If SIP is available, why are you not using the technology?

If you are familiar with H.323, I am going to draw some comparisons to help paint a picture of the SIP concept.  In H.323 you have slow start and fast start.  Basically, slow start sends the H.245 exchange separately, after the H.225 call control and signalling setup.  H.225 is basically your call signalling (it is based on Q.931), while H.245 is your media negotiation.  H.323 fast start takes the H.245 information and embeds it within the H.225 call setup message.  This allows the remote endpoint to choose fast start if it is capable and negotiate the media faster.

With SIP, the same concept exists.  SIP has delayed-offer and early-offer.  SIP also has something called early-media, but we will get into that in another post.  With SIP delayed-offer, call setup proceeds like normal, but our initial message carries no media capabilities, so the terminating end sends its capabilities to us, the calling party.  We then send a message back showing what we want to use and, in a perfect world, everything works out.  Early-offer takes the capabilities of us, the caller, and sends them in the initial call setup message.  This allows the remote end to pick what it wants and send it back to us.
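To make the difference concrete, here is a minimal Python sketch of which message carries the offer and which carries the answer in each mode.  The function name `sdp_placement` is made up for illustration, and the sketch assumes the simplest possible exchange (INVITE, 200 OK, ACK) with nothing fancy such as reliable provisional responses in the way.

```python
def sdp_placement(invite_has_sdp: bool) -> dict:
    """Return which message carries the SDP offer and which the answer.

    Simplified on purpose: assumes a plain INVITE / 200 OK / ACK exchange.
    """
    if invite_has_sdp:
        # Early offer: the caller offers in the INVITE,
        # and the called party answers in the 200 OK.
        return {"mode": "early-offer", "offer": "INVITE", "answer": "200 OK"}
    # Delayed offer: the INVITE has no SDP, so the called party
    # offers in the 200 OK and the caller answers in the ACK.
    return {"mode": "delayed-offer", "offer": "200 OK", "answer": "ACK"}


for has_sdp in (True, False):
    print(sdp_placement(has_sdp))
```

So when you read a trace, whether the very first INVITE has a body is usually the quickest way to tell which mode you are looking at.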

By now, if you know anything about SIP you are probably wondering why I haven't included any terminology.  The reason behind this is that I want those with little to no experience in SIP to grasp the concept before I start throwing out tech jargon that they may or may not understand.  SIP is a very easy protocol to grasp but a very difficult one to master.

When dealing with SIP messages, the best thing about them is that they are all in plain text and are modeled on HTTP.  You will see things like 404 Not Found when something bad happens, as well as a myriad of other messages both good and evil.  A SIP message has three main parts:
  1. The Start Line
    1. This includes the initial portion (e.g. INVITE, 200 OK, PRACK, etc.)
  2. The Header
    1. Things like Via, From, To, Contact, Max-Forwards, Call-ID, etc.
  3. The SDP
    1. This is basically the media negotiation portion.  You will see codecs, DTMF, IP addresses, p-times, bit-rates, etc.
All three of those parts are necessary to set up a call.  I will go over each of them in summary before I go into detail in a later post.  If I went into detail here, this post would be a mile long, much like it almost already is!
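Because everything is plain text, pulling a message apart into those three pieces takes only a few lines.  The following is a rough Python sketch, not a real SIP parser: `split_sip_message` is a hypothetical helper, the addresses and tags in the sample INVITE are made up, and it ignores things a real stack has to handle (repeated headers such as multiple Via lines, folded header lines, and compact header names).

```python
def split_sip_message(raw: str):
    """Split a SIP message into (start_line, headers, body)."""
    # A blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; values such as Via contain colons.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return start_line, headers, body


# Made-up SDP body offering G.711 u-law and a-law.
sdp = "\r\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.0.2.10",
    "s=-",
    "c=IN IP4 192.0.2.10",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0 8",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:8 PCMA/8000",
]) + "\r\n"

# Made-up INVITE carrying that SDP.
message = "\r\n".join([
    "INVITE sip:bob@example.com SIP/2.0",
    "Via: SIP/2.0/UDP 192.0.2.10:5060;branch=z9hG4bK776asdhds",
    "Max-Forwards: 70",
    "From: <sip:alice@example.com>;tag=1928301774",
    "To: <sip:bob@example.com>",
    "Call-ID: a84b4c76e66710@192.0.2.10",
    "CSeq: 314159 INVITE",
    "Contact: <sip:alice@192.0.2.10>",
    "Content-Type: application/sdp",
    f"Content-Length: {len(sdp)}",
]) + "\r\n\r\n" + sdp

start_line, headers, body = split_sip_message(message)
print(start_line)
print(headers["Call-ID"])
print(body.splitlines()[0])
```

The start line tells you what is happening, the headers tell you who and where, and the body (when there is one) tells you what media is on the table.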

The start line, with something like an INVITE, is one of the first things you would see in a debug or a call trace from RTMT.  This start line indicates what is going on with a call leg or call at that point in time.  Below is a list of common start lines:
  1. INVITE
    1. This is typically a request to setup a connection or call
  2. 200 OK
    1. This is an acknowledgement to a request such as the INVITE above
    2. Note that this can contain SDP (more on that later)
  3. PRACK
    1. This is a provisional response acknowledgement, hence PRACK
    2. This basically guarantees that a provisional response, like Ringing, is delivered reliably and acknowledged
  4. BYE
    1. Call teardown in its simplest form
  5. CANCEL
    1. Generally means a request was canceled mid-negotiation/setup
    2. Note that this is different from a BYE: a BYE tears down an established call, while a CANCEL stops a call/request that is still being set up
  6. 180 Ringing
    1. Provisional response to let the requesting station know the phone is "ringing"
  7. 100 Trying
    1. Provisional response to show the INVITE was received and the distant end is "trying" to set up the call
  8. REFER
    1. This essentially asks the receiving end to contact a third party, which is how a call gets transferred elsewhere for one reason or another
The above are the most common ones; there are of course others out there.  Again, I didn't go into detail as I want to save that for another post.  This is purely an introduction to SIP.  Headers in SIP are also in plain text and mostly mean what their names say.  See below for a small list of headers:
  1. Via
  2. From
  3. To
  4. Contact
  5. Max-Forwards
  6. Expires
  7. CSeq
  8. Call-ID
  9. Content-Length
These will also be covered in greater detail.  For now, just become familiar with them and try to think about what they could mean.  Finally, SIP has an SDP element which it uses to negotiate media.  For instance, if I want to speak G.711u-law, I would advertise PCMU/8000.  This is Pulse Code Modulation, mu-law, with a clock rate of 8000 Hz.  Basically 8kHz sampling with 8 bits per sample (8,000 x 8 = 64,000 bits per second) for a total of 64kbps in bandwidth, seem familiar?  There is also PCMA/8000 for G.711a-law.  So, I would offer this in my SDP.  I could also offer G.729 and iLBC or whatever other codecs I wanted as well.  Basically, the two ends will negotiate a codec that both of them support, usually the most preferred one they have in common.  See below for a better illustration of this.


Basically, with humor I described how SIP works with media negotiations.  I just substituted codecs for languages since that analogy seems a bit easier to understand.  One person says "I can speak these codecs" while the other says "I have this codec out of that list".  Keeping it simple stupid, that is how it works.  I will make another post and take things to another level on this later.
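The same "languages" exchange can be sketched in a few lines of Python.  This is a simplification under stated assumptions: `negotiate` is a hypothetical helper, the codec lists are made up, and the rule of taking the first codec in the caller's list that the other side also supports is just one common policy.  A real answer may list more than one codec or apply its own preference order.

```python
def negotiate(offered, supported):
    """Return the first offered codec the answering side also supports.

    `offered` is the caller's list in preference order; `supported` is the
    set of codecs the answering side can speak. Returns None if there is
    nothing in common.
    """
    for codec in offered:
        if codec in supported:
            return codec
    return None


# "I can speak these codecs"
caller_offer = ["PCMU/8000", "PCMA/8000", "G729/8000", "iLBC/8000"]
# "I have this codec out of that list"
callee_supports = {"G729/8000", "PCMA/8000"}

print(negotiate(caller_offer, callee_supports))  # PCMA/8000
```

If the two lists have nothing in common, the helper returns None, which is the code version of two people with no shared language.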

I hope that this very very VERY basic SIP introduction has been helpful.  I try to break things down to a third grade level because it makes understanding things easier as well as more entertaining.  Reading a Cisco book is great, but it can be (and usually is) very dry, and your head ends up on the keyboard or desk.

