Real-Time Text

By: Arnoud van Wijk

Date: October 7, 2008

When we want to communicate electronically, most of us use voice and, at an increasing rate, video. Such communications occur in real time (see ref 1): we send and receive audio and video continuously as we communicate, and we consider this the normal way to converse with each other.

Real-time devices
To the deaf or hard of hearing, real-time text feels more like conversation.

For most of us, text is a static medium. We use it to read newspapers and Web sites; we exchange text messages by using mobile phones; and we use instant messaging on our computers to communicate with each other while doing other tasks. When we need efficient conversation, we pick up the phone and call the person.

But what do we do if we’re unable to use a telephone because we can’t hear or speak or because we’re in an environment or a situation where the use of voice is inappropriate, such as in a restaurant or during a meeting? What if we find ourselves in danger and need to contact the police without being heard?

The solution is real-time text. For the majority, this will be a valuable communication medium alongside audio and video. Real-time text can be used either as the only communication mode or together with audio and video, which is called total conversation (see ref 2).

For those who are deaf or hard of hearing or who have a speech impairment, real-time text is an essential communication capability. This is especially true in the current era. Every day, we all depend more and more on the telephone for immediate contact. Social contacts are maintained, business is conducted, and even our safety depends on the telephone. With real-time text as a mainstream communication feature, Internet telephony is for everyone.

The Technology Behind Real-Time Text

Real-time text works by sending and receiving text on a character-by-character basis. The characters are sent immediately (within a fraction of a second) once typed and are then displayed immediately to the receiving person(s). This allows text to be used in the same conversational manner as voice. It’s like talking by using text.

Real-time text that runs over IP networks is designed around the ITU T.140 real-time text presentation layer protocol. T.140 allows real-time editing of text, even in cases of backspacing and retyping. T.140 is based on the ISO 10646-1 character set, which is used by most IP text specifications, and it uses the UTF-8 format. This allows any language to be used with real-time text, including English, Chinese, and Russian.
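
As a rough illustration of the editing behaviour described above, the following Python sketch applies received UTF-8 text to a display buffer and treats the backspace character (U+0008) as "erase the last character." It is a minimal sketch, not a T.140 implementation: the function name `apply_t140` is hypothetical, other T.140 control codes are ignored, and a backspace here removes a single code point.

```python
BACKSPACE = "\u0008"  # erase-last-character control code


def apply_t140(display: str, received: bytes) -> str:
    """Apply one chunk of received UTF-8 text to the text shown so far."""
    for ch in received.decode("utf-8"):
        if ch == BACKSPACE:
            display = display[:-1]  # drop the previous character, if any
        else:
            display += ch
    return display


# The sender types "Helo", backspaces once, and then finishes the line.
shown = ""
for chunk in ("Hel", "o\u0008", "lo Привет"):
    shown = apply_t140(shown, chunk.encode("utf-8"))
print(shown)  # Hello Привет
```

Because the receiver redraws after every chunk, the remote user sees the typo appear and disappear, much as a listener hears a speaker correct a word.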

Real-time text uses the same real-time transport protocol (RTP) as voice over IP (VoIP) and video over IP. The text is encoded according to IETF RFC 4103 (RTP Payload for Text Conversation), which supports an optional error-correction scheme based on redundant transmission (as described in RFC 2198). This results in very low end-to-end character loss, even across IP networks that have moderately high packet loss. (It also makes real-time text well suited to wireless access.)
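
The idea behind redundant transmission can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the RFC 4103 or RFC 2198 wire format: packets are plain dictionaries, the class names are hypothetical, and two redundant generations are assumed. Each packet carries the newest text plus the text of the previous packets, so a receiver that notices a gap in sequence numbers can fill it from the next packet that arrives.

```python
from collections import deque


class RedundantSender:
    """Each packet carries new text plus the last `generations` chunks."""

    def __init__(self, generations: int = 2):
        self.history = deque(maxlen=generations)  # oldest to newest
        self.seq = 0

    def packet(self, text: str) -> dict:
        pkt = {"seq": self.seq, "primary": text, "redundant": list(self.history)}
        self.history.append(text)
        self.seq += 1
        return pkt


class Receiver:
    def __init__(self):
        self.next_seq = 0
        self.text = ""

    def receive(self, pkt: dict) -> None:
        missing = pkt["seq"] - self.next_seq  # packets lost just before this one
        if missing < 0:
            return  # late or duplicate packet; already handled
        redundant = pkt["redundant"]
        if missing > len(redundant):
            self.text += "\ufffd"  # some text is unrecoverable; mark the gap
            missing = len(redundant)
        if missing > 0:
            self.text += "".join(redundant[-missing:])  # recover lost chunks
        self.text += pkt["primary"]
        self.next_seq = pkt["seq"] + 1


sender, receiver = RedundantSender(), Receiver()
packets = [sender.packet(chunk) for chunk in ("Hel", "lo ", "wor", "ld")]
for pkt in (packets[0], packets[3]):  # packets 1 and 2 are lost in transit
    receiver.receive(pkt)
print(receiver.text)  # Hello world
```

With two redundant generations, text is lost only when three or more consecutive packets go missing, which is why moderate packet loss rarely costs a character.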

To improve efficiency, the text is buffered for 300-500 milliseconds before it is sent, which still meets the real-time text performance requirements of RFC 4103. The traffic load of real-time text at 30 characters per second is between 2 and 3 kilobits per second depending on the language used (including the overheads for RFC 4103 with the maximum level of redundancy, RTP, UDP and IP).
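
The order of magnitude of that figure can be checked with a back-of-the-envelope calculation. The Python sketch below is an estimate under explicit assumptions that are not taken from the article: a 300-millisecond buffering interval, two redundant generations, a 4-byte header per redundant block plus a 1-byte primary header (as in RFC 2198), and 12-byte RTP, 8-byte UDP, and 20-byte IPv4 headers. The function name `rtt_bitrate` is hypothetical.

```python
def rtt_bitrate(chars_per_second: float, bytes_per_char: int,
                interval_s: float = 0.3, redundant_generations: int = 2) -> float:
    """Estimate the IP-level bit rate of a real-time text stream, in bit/s."""
    chars_per_packet = chars_per_second * interval_s
    # New text plus a repeated copy for each redundant generation.
    text_bytes = chars_per_packet * bytes_per_char * (1 + redundant_generations)
    # Redundancy headers: 4 bytes per redundant block, 1 byte for the primary.
    red_header_bytes = 4 * redundant_generations + 1
    packet_bytes = text_bytes + red_header_bytes + 12 + 8 + 20  # RTP, UDP, IPv4
    return packet_bytes * 8 / interval_s


for label, width in (("1-byte UTF-8 characters", 1), ("2-byte UTF-8 characters", 2)):
    print(f"{label}: {rtt_bitrate(30, width) / 1000:.1f} kbit/s")
# 1-byte UTF-8 characters: 2.0 kbit/s
# 2-byte UTF-8 characters: 2.7 kbit/s
```

Under these assumptions the fixed per-packet headers account for most of the traffic, so the bit rate depends as much on the buffering interval as on the typing speed.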

Real-time text uses the standard session initiation protocol (SIP) (RFC 3261) and the session description protocol (SDP) (RFC 4566). SIP is used without any alteration; there is no difference between real-time text and VoIP for SIP. The real-time text encoding is identified by using the SDP media definition m=text.
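
To show what the m=text media definition looks like in practice, the Python sketch below scans a session description for a text media section that maps a payload type to the t140 encoding. The session description follows the pattern of the examples in RFC 4103 (t140 plus red for redundancy); the addresses, ports, and the function name `offers_real_time_text` are illustrative assumptions, and the parsing is deliberately minimal.

```python
def offers_real_time_text(sdp: str) -> bool:
    """Return True if the SDP has an m=text section with a t140 rtpmap."""
    in_text_media = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # A new media section starts; remember whether it is text.
            in_text_media = line.startswith("m=text ")
        elif in_text_media and line.startswith("a=rtpmap:"):
            parts = line.split(None, 1)  # e.g. ["a=rtpmap:98", "t140/1000"]
            if len(parts) == 2 and parts[1].lower().startswith("t140/"):
                return True
    return False


EXAMPLE_SDP = """\
v=0
o=alice 2890844526 2890844526 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0
m=text 11000 RTP/AVP 98 100
a=rtpmap:98 t140/1000
a=rtpmap:100 red/1000
a=fmtp:100 98/98/98
"""

print(offers_real_time_text(EXAMPLE_SDP))  # True
```

The text stream is negotiated next to the audio stream in the same offer, which is why SIP itself needs no changes to carry real-time text.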

To ensure proper technical implementation and use of real-time text, RFC 5194 (see ref 3) lists the essential requirements for real-time text and defines a framework for implementation of all required functions based on SIP and RTP. This includes interworking between real-time text and existing text telephony on the public switched telephone network and other networks.

The ECRIT IETF working group defines real-time text as one medium in the access to emergency services (see RFC 5012, draft-ietf-ecrit-phonebcp and draft-ietf-ecrit-framework). With the growing number of people with hearing and/or speech impairments, it would be prudent for multimedia emergency public-safety answering points in Europe (which uses 112) as well as in the United States and Canada (which use 911) to support real-time text.

Mainstream Use of Real-Time Text

The use of real-time text will not be limited to those who cannot use speech. Like captioning on TV, real-time text will be used more by people without disabilities than by those who have them. On phone calls, people can type in phone numbers, addresses, names, and other information better passed in text than dictated, especially when the two communicators have different accents. When using one line or otherwise occupied in a conference, a person can answer a second line in text only and receive a quick message or have a quick text conversation. When talking to an elderly parent, for example, a person can use text to supplement voice to make sure important information has been understood. In interactions with an interactive voice response system, instead of having to wait while the voice slowly reads out all the choices, real-time text can provide an almost instant list of the choices visually so users can immediately read and select the number they want to press. These and a myriad of other uses will become common as real-time text gets deployed as a natural and always-present parallel communication mode on any voice phone call.

The Real-Time Text Taskforce

Launched on 30 July 2008, the Real-Time Text Taskforce (R3TF) is an independent open forum for engineers, motivated individuals, experts, companies, and organizations that wish to help test, implement, and advance the widespread adoption of the real-time text framework (see ref 4). Its goal is to ensure that real-time text is as readily available as voice for all users. The Internet Society is assisting in the effort by serving as an incubator of the R3TF.

Having a single real-time text standard that is used everywhere would make access to communication services easier, and it would eliminate any potential interworking issues. Unfortunately, with the diverse types of communication networks and devices that are in use today, this is not possible.

The R3TF will promote real-time text as the real-time text standard that most terminals and networks can either use natively or easily interconnect with via gateways at network borders. This means that alternative real-time text protocols may be used, but they must be able to interconnect via gateways with real-time text to ensure full interoperability. This will make possible the goal of real-time text being available everywhere.

The R3TF will help facilitate the development of interworking test beds that will enable implementers to test how well their solutions comply with the standard. Moreover, the task force will facilitate the distribution of information about the technology, its user requirements, and its implementation, and it will act as an educator on related issues.

The Web site of the R3TF is

The R3TF Is for Everyone

What can you do to help the R3TF?

  • Add your knowledge and expertise to those of the R3TF so that the task force can grow real-time text and remove barriers to its implementation.
  • Help prevent SIP networks from blocking real-time text traffic, even if they block VoIP traffic for internal reasons. In SIP networks and products, real-time text support, as described in RFC 4103, should be regarded as mainstream and feasible today. Real-time text not only works now; it also is part of next-generation-network system specifications.
  • For non-SIP networks and products, ensure that the real-time text protocol or service used interoperates with RFC 4103.
  • Build on the open-source client that supports real-time text (as well as VoIP and video) to implement clients for mobile/cellular terminals as well as computers.
  • Include real-time text in your design and development of new services, such as voice and videoconference services, answering machine services, call centre services, language interpretation services, gateway services, network interconnection services, and emergency services. All of those services become more usable when they support real-time text together with voice and sometimes video. Open-source components are available and should be continually improved when possible.
  • Include real-time text in 112/911 emergency services when they move to IP networks. Authorities are already pushing for this in the European Union and the United States.
  • Become an R3TF sponsor and encourage employees to participate in projects.


  1. The International Telecommunication Union (ITU) defines real time in ITU-T F.700 Section Real-time text is defined in ITU-T F.700 Annex A.3 and ITU-T F.703 Section
  2. Defined in ITU F.703 Section 7.2 and specified for the next-generation-network IP Multimedia Subsystem in 3rd Generation Partnership Project TS 26.114, Multimedia Telephony, Media Handling and Interaction.
  3. RFC 5194 was published by the IETF as an informational document.
  4. The Real-Time Text Taskforce is a project group separate from the IETF.

Arnoud van Wijk is disability projects coordinator for the Internet Society. Along with other researchers, Arnoud documented the technique for real-time text described in this article, combining existing IETF standards to facilitate text streaming over IP networks. He and Guido Gybels, director of New Technologies at RNID (, with contributions from other experts in communication and accessibility for people with disabilities, edited and coauthored Framework for Real-Time Text over IP Using the Session Initiation Protocol, which the IETF recently published as informational document RFC 5194.