Update: The information in this post has become obsolete over time. Recently we’ve got the RCS Universal Profile standard, which defines Green Button Promise for Voice and Green Button Promise for IP Video Call Services and Enriched Voice Calling. I’m working on updated material, which will compare RCS IP Calls and VoLTE/VoWifi calling and describe the RCS Network Architecture.
It is similar with many OTT applications for calls and messaging. Which one is the right or at least a decent one? The big advantage is that usually we can try it for free. Anyway, what is the difference between a voice call performed as VoLTE (or VoWiFi) and a voice call which is transmitted via RCS?
The 4G networks were originally designed for pure data traffic. But the voice service is a must-have feature and right now it can’t be fully replaced by VoIP. The voice service has to be reliable and fully compatible with the service from 3G (Blacklisting, Call diverting, Lawful intercept, Emergency calls, DTMF, etc.) and has to allow seamless fallback in case the 4G coverage is lost (eSRVCC). This new service is called VoLTE.
Sometimes people wonder why it takes so much time and effort to make RCS work when LINE, Whatsapp, Viber, Skype, Lync and others just work and are (nearly) free. One of the reasons is the enormous amount of standards and specifications for SIP/IMS/RCS. Just take a look at RFC 4480, where we can find ridiculous definitions like:
Activities such as <appointment>, <breakfast>, <dinner>, <holiday>, <lunch>, <meal>, <meeting>, <performance>, <travel>, or <vacation> can often be derived from calendar information.
dinner: The person is having his or her main meal of the day, eaten in the evening or at midday.
holiday: This is a scheduled national or local holiday.
in-transit: The person is riding in a vehicle, such as a car, but not steering. The <place-type> element provides more specific information about the type of conveyance the person is using.
looking-for-work: The presentity is looking for (paid) work.
lunch: The person is eating his or her midday meal.
“RCS standardized itself to death” – I’m not sure who said it first but there is definitely something about it 🙂
We have already seen a bit of presence in the post Is the Presence Social? This time we’ll extend our framework with other nodes from OMA specification.
There are also some other network elements and interfaces, but for now this is more than enough. Just think about it: we need all these servers and databases for the simple information whether our friend is present or not, and maybe some tag line or avatar. No wonder that many OTTs use either a very simple or a proprietary presence solution. Even RCS doesn’t rely on the full presence implementation and only SIP OPTIONS support is mandatory.
You have seen a couple of IMS flows. You know the registration, the 3rd party registration. Flows of SIP INVITE, SDP, SIP MESSAGE, RTP and RTCP, MSRP … do you think you know enough? For the core functionality this is nearly all we need. But for a real service there are many other things we have to learn. Yes in theory there is the Signaling and Media Plane. But there is also a parallel world of Presence and Capabilities. (And some other parallel worlds of billing, provisioning, monitoring, etc.)
The cornerstone mechanism for all the VoLTE/ViLTE, Instant Messaging and particularly for the RCS is capability discovery. Users want to see which RCS services are available for communicating with each of their contacts. This can be implemented either using SIP OPTIONS or using a Presence-based solution defined in RCS Release 1-4. Both will result in one of three types of response:
The contact is registered for service, resulting in the contact’s current service capabilities being received and logged, or,
the contact is not registered (they are provisioned but not registered), or,
the contact is not found (they are not provisioned for service).
This discovery mechanism is important since it ensures that users can determine what services are available before the communication starts. The same mechanisms can be used to initially discover (and/or periodically check) the service capabilities of all the contacts within an address book when we first register for the service. For the Service Providers it is also very useful because they can add new types of communication channels without compatibility issues.
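The three outcomes above can be sketched as a small classifier over the final response to a SIP OPTIONS request. This is only an illustration: the mapping of 404 to "not found" and of 408/480 to "not registered" is an assumption based on how RCS capability discovery is commonly described, and the names `DiscoveryResult` and `classify_options_response` are made up for this example.

```python
from enum import Enum


class DiscoveryResult(Enum):
    REGISTERED = "registered"
    NOT_REGISTERED = "not registered"
    NOT_FOUND = "not found"


# Assumed mapping of SIP status codes to discovery outcomes (illustrative).
NOT_REGISTERED_CODES = {408, 480}  # Request Timeout, Temporarily Unavailable
NOT_FOUND_CODES = {404}            # Not Found


def classify_options_response(status_code, feature_tags=()):
    """Map a final OPTIONS response to (outcome, capabilities)."""
    if 200 <= status_code < 300:
        # Registered contact: keep the capabilities it advertised.
        return DiscoveryResult.REGISTERED, set(feature_tags)
    if status_code in NOT_REGISTERED_CODES:
        return DiscoveryResult.NOT_REGISTERED, set()
    if status_code in NOT_FOUND_CODES:
        return DiscoveryResult.NOT_FOUND, set()
    # Anything else is outside this sketch; a real client would likely
    # keep the previously cached capabilities instead of failing.
    raise ValueError(f"unhandled status code: {status_code}")
```

A client could run this over the whole address book after the first registration and cache the result per contact.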
Capabilities of a device are shared during SIP Registration, over SIP OPTIONS and using the presence system.
SIP OPTIONS for VoLTE
In practice, in VoLTE networks we can find SIP OPTIONS messages used as application-layer pings or heartbeats. Via them we can monitor the availability of IMS network servers (instead of SNMP traps, for example). E.g. the S-CSCF monitors Application Servers using SIP OPTIONS. If the TAS is able to respond with 200 OK, it means it is up and running and not overloaded.
Moreover, in VoLTE/ViLTE the SIP OPTIONS can also be used for capabilities sharing. The IR.94 says that a Contact header field in a 200 OK response message to a SIP OPTIONS request must include the IMS Communication Service Identifier (ICSI) value of “urn:urn-7:3gpp-service.ims.icsi.mmtel”, as defined in 3GPP TS 24.173. In addition, for ViLTE, a Contact header in a 200 OK response message to a SIP OPTIONS request must include a “video” media feature tag.
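As a rough sketch, this is how a client might check the Contact header of a 200 OK for the MMTel ICSI and the “video” feature tag. The ICSI is written as in 3GPP TS 24.173 (`3gpp-service`, with a hyphen). The sample header, the address in it and the helper names are assumptions made for the example, and the parser is deliberately naive (it does not handle semicolons inside quoted values).

```python
from urllib.parse import unquote

MMTEL_ICSI = "urn:urn-7:3gpp-service.ims.icsi.mmtel"


def contact_feature_tags(contact_value):
    """Return the header parameters after the <uri> as {name: value or True}."""
    _, _, params = contact_value.partition(">")
    tags = {}
    for part in params.split(";"):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Values are usually quoted and percent-encoded; bare tags map to True.
        tags[name.strip().lower()] = unquote(value.strip().strip('"')) if sep else True
    return tags


def supports_mmtel(tags):
    icsi_refs = tags.get("+g.3gpp.icsi-ref", "")
    return isinstance(icsi_refs, str) and MMTEL_ICSI in icsi_refs.split(",")


def supports_video(tags):
    return "video" in tags


# Hypothetical Contact value from a 200 OK to OPTIONS.
sample = ('<sip:alice@192.0.2.10:5060>;'
          '+g.3gpp.icsi-ref="urn%3Aurn-7%3A3gpp-service.ims.icsi.mmtel";video')
tags = contact_feature_tags(sample)
print(supports_mmtel(tags), supports_video(tags))
```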
SIP OPTIONS for Presence and Capability Discovery
In this post we’ll address mainly the SIP OPTIONS presence/capabilities discovery and the basic presence system. The SIP OPTIONS method is sent as an end-to-end message. It is used both to query the capabilities (services which the other user has available) of the target contact and to pass the information about which capabilities are supported by the requester. Using this method, both users get updated information in a single transaction.
Being a trainer is a nice job. But one shouldn’t use an instant messenger (Skype, Lync, Google Hangout, Jabber, etc.). I always forget to switch it off or change my presence to ‘off-work’ (sometimes that doesn’t help either). For example, it is 9 pm, I’m tired after two weeks of training, sitting in a hotel room and watching some movie on my laptop. Then suddenly my IM window pops up and one of my friends (sitting in a different timezone) is asking for some technical advice. He/She sees I’m online and active, so it’s not exactly polite to ignore the request. Sure, no need to be a trainer, I guess you know it as well.
Presence indication is one of the key attributes of Instant Messaging or Real Time Communication in general. RCS 5.2 defines the social presence with the following attributes:
Availability, indicates the user’s (un)willingness to communicate,
Portrait icon, depicting the user (e.g. a photo or image provided by the contact himself),
Free text, including textual note and possibility to add emoticons (automatic translation of some specific characters into smileys),
Favourite link, to publish hypertext link of personal and/or favourite site,
Timestamp, date of the last update of the profile, generated automatically,
Geolocation, depicts the user location.
We shouldn’t limit ourselves only to them. Stop for a while thinking about Real Time Communication as something dedicated to human beings only. For Machine to Machine communication (M2M/M2X/IoT) presence is maybe even more important.
For example a distributor of gas has smart monitoring devices which are all connected to a monitoring station. Moreover some of the devices are interconnected among themselves. Their “social” presence can be related to their readiness, actual throughput, power supply, strength of radio signal, reliability and condition. (However it is true that SIP/IMS/RCS were designed primarily for humans and there are also other ways how to solve presence for M2M.)
UEs and Application Servers rely on the presence information very much. It can be important for both signaling and data. Based on the presence we can choose which device to use, which access method is the best, what timeouts are optimal, which codecs are applicable etc.
Anyway, before we dive into the RCS concept of the social presence we should recap the standard mechanisms for SIP defined in RFC 3903, RFC 2778 and RFC 3856 and the Presence Service for IMS defined in 3GPP TS 24.141.
There are main principles and there are small technical details. People mostly understand how a petrol engine works. But that doesn’t mean they are able to fix one or even make it. No, it is a long way to understand how things really work. And it is a psychologically proven fact that we can’t really understand things until we try to change them. Well, don’t try it at home with your petrol engine.
This time we will take a closer look at one of the often omitted interfaces. You don’t need it in order to understand the general idea of IMS, but when we go into detail it can be very important.
The Sh interface is defined by 3GPP in TS 29.329 as a Diameter interface between an AS and HSS.
I remember when we were learning the limits in math analysis:
Definition: Let $(x_n)$ be a sequence of real numbers. The sequence $(x_n)$ is said to converge to a real number $a$ if for all $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that $|x_n - a| < \varepsilon$ for all $n \ge N$.
Take your time. The human brain simply has a problem processing more than two variables. Three is sometimes too much.
When I say during a training that each dialog is identified by the triple Call-ID, From-tag and To-tag, sometimes people start to experience the same kind of discomfort. So what are these headers for?
Call-ID (described in RFC 3261) identifies a session. So in a common session the Call-ID will be the same in INVITE, PRACK, REFER, ACK, BYE and all the responses. Let’s say someone wants to establish a video chat with me. I have several active devices. Then I’ll receive the INVITE on each of them and they will all contain the same Call-ID.
In SIP any dialog is possible between two end points – UAC/UAS only. If more parties are involved we can’t call it a dialog anymore (true :)). That means that the dialog is a subset of a session for particular UAs. And that is actually the reason why we need to add the tags into the identification of the dialog.
The SIP client adds the From-tag into the request. The recipient then adds the To-tag in the response. This helps to better identify the originator and recipient.
In case of the recipient the main reason is forking. Simply the recipient can use more devices and the one which will send 200 OK as the first (including its own To-tag) will be the one which will continue in the dialog.
In case of originator it can be the situation that originator is in a dialog with itself (because of testing purposes or so-called “hairpinning” of calls in PSTN gateways) and needs to distinguish between the originating and terminating end.
The identifiers have to be unique (across the time and space 🙂 ). Besides the requirement for global uniqueness, the algorithm for generating a tag is implementation-specific. Tags are also helpful in fault tolerant systems, where a dialog is to be recovered on an alternate server after a failure.
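To make the triple more tangible, here is a minimal sketch of a dialog identifier built from Call-ID and the two tags, with a forked INVITE answered by two devices. The local/remote split follows RFC 3261 (at the UAC the local tag is the From-tag, at the UAS it is the To-tag); the class name, helper names and sample values are invented for the example.

```python
from typing import NamedTuple


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


def dialog_id_at_uac(call_id, from_tag, to_tag):
    # The originator generated the From-tag, so that is its local tag.
    return DialogId(call_id, local_tag=from_tag, remote_tag=to_tag)


def dialog_id_at_uas(call_id, from_tag, to_tag):
    # The recipient generated the To-tag, so the roles are swapped.
    return DialogId(call_id, local_tag=to_tag, remote_tag=from_tag)


# One INVITE forked to two devices: same Call-ID and From-tag,
# but each device answers with its own To-tag.
call_id, from_tag = "a84b4c76e66710@pc33.example.com", "1928301774"
phone = dialog_id_at_uac(call_id, from_tag, to_tag="phone-314159")
tablet = dialog_id_at_uac(call_id, from_tag, to_tag="tablet-271828")

# Same session (Call-ID), but two distinct dialogs.
print(phone.call_id == tablet.call_id, phone != tablet)
```

Without the To-tag the originator would have no way to tell the two answers apart.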
And last but not least it also explains another famous triple – INVITE, 200 OK, ACK. In the call flow above we can see why a 200 OK is not enough to start a media session. The most obvious reason for ACK is that the link is unreliable.
But it can also happen that several devices send a 200 OK, and a responding UAS can’t be sure that its 200 OK will be the first one to arrive at the UAC. So only the first UAS will receive the ACK and can start the RTP session.
Mind that if any B2BUA is involved it can change Call-ID value and we need to follow two dialogs (UAC-B2BUA, B2BUA-UAS).
Authors of SIP and IMS were maybe very smart. But they were definitely not operations engineers. In practice it is not easy to trace all the messages which belong to a particular flow. Mostly we use the P-Asserted-Identity (and its equivalents such as X-XCAP-Asserted-Identity or X-3GPP-Asserted-Identity) and we incrementally add new conditions to the filter. That’s why it is also kind of difficult to find really good trace tools, and many operators create their own.
Very funny. Even in 2015 we still rely in many ways on the good old SMS. No, operators are not making money on the service as they used to in the past. However, it is not easy for them to fully replace the service (e.g. by RCS). That’s why the VoLTE standard GSMA IR.92 says:
The UE must implement the roles of an SM-over-IP sender and an SM-over-IP receiver, the IMS core network must take the role of an IP-SM-GW.
Update: SMS is about to stay even in 5G. So much for RCS replacing SMS.
In other words, the VoLTE network has to support the (legacy) SMS sent over IP (SIP). The VoLTE phone will receive a common (binary) SMS and the native client will display this message as any other. The only difference is that this time the SMS is sent from an IMS network over SIP protocol. Mind the purpose is not just to support common text messages, but also to support OTA messaging for (U)SIM provisioning, SMS ‘non-text’ applications or Message Waiting Indication for Voice Mail.
The network functionality which provides messaging service in the IMS network is called IP-SM Gateway (IP-SM-GW) and from the IMS point of view it is an Application Server. IP-SM-GW Call flows originate or terminate in LTE network and IP-SM-GW forwards the messages towards/from legacy SMSC, which is still located in CS network.
Originally the core network elements supported services. The SMS service was one of them, and a very important one. The application server is called SMSC.
The main functionality of SMSC is to “Store-and-forward”. Basically the SMSC receives an SMS (MO-FWSM), stores it and acknowledges it back. Then it tries to deliver it. For that it needs to receive the routing information from HLR. If the delivery is not successful the message is scheduled for a retry.
The routing of MT message is done based on the information received from HLR. So firstly based on the message prefix we will route Send-Routing-Information-request (SRI-req) to a responsible HLR. HLR takes a look in its tables and finds out what MSC is currently handling the recipient. The address (Global Title) is returned as Network-Node-Number (NNN). It is possible to return both MSC and SGSN address and the preference how to deliver the message.
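The store-and-forward idea and the HLR lookup can be sketched in a few lines. This is a toy model, not a real MAP implementation: the classes `Hlr` and `Smsc`, the longest-prefix routing table, the `deliver` callback and all addresses are assumptions made for illustration.

```python
from collections import deque


class Hlr:
    def __init__(self, locations):
        self.locations = dict(locations)  # MSISDN -> serving node (NNN)

    def send_routing_info(self, msisdn):
        # Returns the Network-Node-Number, or None for an absent subscriber.
        return self.locations.get(msisdn)


class Smsc:
    def __init__(self, hlr_by_prefix, deliver):
        self.hlr_by_prefix = hlr_by_prefix  # number prefix -> responsible HLR
        self.deliver = deliver              # callable(nnn, msisdn, text) -> bool
        self.retry_queue = deque()

    def _hlr_for(self, msisdn):
        matches = [p for p in self.hlr_by_prefix if msisdn.startswith(p)]
        return self.hlr_by_prefix[max(matches, key=len)] if matches else None

    def submit(self, msisdn, text):
        # Store first, acknowledge immediately, deliver later.
        self.retry_queue.append((msisdn, text))
        return "ack"

    def attempt_delivery(self):
        """Try every stored message once; failed ones stay queued for a retry."""
        delivered = []
        for _ in range(len(self.retry_queue)):
            msisdn, text = self.retry_queue.popleft()
            hlr = self._hlr_for(msisdn)
            nnn = hlr.send_routing_info(msisdn) if hlr else None
            if nnn is not None and self.deliver(nnn, msisdn, text):
                delivered.append((msisdn, nnn))
            else:
                self.retry_queue.append((msisdn, text))
        return delivered
```

A real SMSC would add retry timers, validity periods and the Alert-Service-Centre trigger instead of blindly re-running the queue.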
There are some more operations, such as AlertSC and RMDS, which have to be supported. The Alert-Service-Centre message is used to trigger the SMSC to deliver messages of previously Absent Subscribers. Report-SM-Delivery-Status is sent by the SMSC to update the information about the subscriber in the HLR. Anyway, both the architecture and the message flows are much simpler than in the case of IMS.
In IMS we are used to applying services for both originator and recipient. When we look at the original SMSC flow, we see that we can apply the services only for the originator. Around 2006 mobile operators realized that they were losing a lot of money and came up with the concept of homerouting. (Technically it had already been present for a few years as the so-called Foreign Subscriber Gateway, FSG.)
The idea was simple. Instead of direct delivery to the recipient’s MSC, the SMSC of the originator (SMSC-A) will forward the message to the recipient’s SMSC (SMSC-B). SMSC-B will apply the services for the recipient and will try to deliver the Short Message.
This should be done in a transactional mode and the SMSC-A is still responsible for the retries. That’s because the SMSC-A needs to know the delivery result. Hence it can generate the notification ‘delivered/deleted’.
Note, that the SMSC-B acts – from the SMSC-A point of view – as both, HLR and MSC. That means that the GT of SMSC-B has to be preconfigured for SRIs on the SMSC-A.
Btw. The homerouting can introduce very nice loops in the network (which some operators intentionally misused ;)).
Not all operators use this call flow. Also, mainly in North America, mobile operators prefer the SMPP protocol instead of Sigtran for transfer between networks (SMSC-A to SMSC-B).
In 4G networks we reuse the homerouting scenario and the role of SMSC-B is played by the IP-SM-GW.
The last time we went through the registration in the IMS. During the registration the S-CSCF (the SIP Server handling a particular subscriber) authenticates the subscriber, learns her current point of presence, capabilities of her device, etc. But when it comes to multimedia sessions, the S-CSCF can offer only some basic functions such as session routing or session management. For VoLTE, VoWifi or RCS service profiles we want to apply the service logic provided by Application Servers (TAS, RCS, IPSMGW..). The purpose of the Third party registration is to let the ASs know that the user is now connected and ready to communicate.
We ended up in the situation when a registrar (S-CSCF) authenticated the user. After the successful authentication S-CSCF downloads a user’s profile from the HSS. The profile contains information about what Application Servers (ASs) shall be triggered on behalf of the subscriber. The information is stored in a form of initial filter criteria (iFC).
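A heavily simplified sketch of how the S-CSCF could pick the Application Servers for a third-party REGISTER from the downloaded profile. Real iFCs carry full trigger points (service point triggers, session case, default handling and so on); here each criterion matches on the SIP method only. The class name, the server URIs and the priority values are assumptions for the example; the only rule taken from the specifications is that a lower priority number is evaluated first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InitialFilterCriteria:
    priority: int            # lower number = evaluated first
    method: str              # simplified trigger point: SIP method only
    application_server: str  # AS URI to contact when the trigger matches


def servers_to_trigger(ifcs, method):
    """Return the AS URIs whose trigger matches, in priority order."""
    matching = [c for c in ifcs if c.method == method]
    return [c.application_server for c in sorted(matching, key=lambda c: c.priority)]


# Hypothetical user profile as it might be downloaded from the HSS.
profile = [
    InitialFilterCriteria(20, "REGISTER", "sip:ipsmgw.ims.example.net"),
    InitialFilterCriteria(10, "REGISTER", "sip:tas.ims.example.net"),
    InitialFilterCriteria(30, "INVITE", "sip:tas.ims.example.net"),
]

for server in servers_to_trigger(profile, "REGISTER"):
    print("third-party REGISTER ->", server)
```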
Registration is something that was missing in the times of wires. Ok, not completely, but it was done more or less once, when an MSISDN was assigned to some line.
In contrast to the previous types of networks, the 4G network is ‘user-centric’. It means the user can use multiple devices and identities and we have to deal with it. The main purpose of the registration is to create a binding between the user (her public identity) and the IP address of the device, so we know where we can send the data to. That’s why there is a Contact header in the SIP REGISTER message. The Contact header contains an address which identifies the current location of the user (Point-of-Presence, e.g. IP/FQDN of a particular client). During the registration the Contact Address is linked to a Public Identity (IMPU, AOR in SIP terminology). The IMPU is an equivalent of the MSISDN in GSM and has to be present in the To header of the SIP REGISTER. One identity can be used by more terminals, and as each terminal can have different capabilities this has to be taken into account as well. This information is part of the Contact header.
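The binding described above can be pictured as a small table keyed by the public identity. The sketch below is an in-memory toy, assuming made-up names (`Registrar`, `contacts_for`) and sample addresses; a real registrar would also track expiry, instance IDs and authentication state.

```python
from collections import defaultdict


class Registrar:
    def __init__(self):
        # IMPU (address of record) -> {contact address: capabilities}
        self._bindings = defaultdict(dict)

    def register(self, impu, contact, capabilities=()):
        # One identity may be bound to several terminals at once.
        self._bindings[impu][contact] = frozenset(capabilities)

    def deregister(self, impu, contact):
        self._bindings[impu].pop(contact, None)

    def contacts_for(self, impu, required_capability=None):
        """Contacts bound to the IMPU, optionally filtered by a capability."""
        bindings = self._bindings.get(impu, {})
        return sorted(
            contact
            for contact, caps in bindings.items()
            if required_capability is None or required_capability in caps
        )


registrar = Registrar()
registrar.register("sip:alice@ims.example.net", "sip:alice@192.0.2.10", {"audio", "video"})
registrar.register("sip:alice@ims.example.net", "sip:alice@192.0.2.20", {"audio"})

# Only the first terminal can take a video session.
print(registrar.contacts_for("sip:alice@ims.example.net", "video"))
```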