Thanks to Christoph K. for this clear and expert post about 3G video calling.
3G video telephony generally operates over a single 64 kbit/s connection whose bandwidth both parties must share. Because H.245 call control messages reduce the gross bandwidth, the application is effectively left with 60 kbit/s or less for both media types combined. In 3G-324M systems, the bandwidth is allocated dynamically; as a rule of thumb, however, each party has 50% of the bandwidth available for sending audio and video signals. In a typical unidirectional scenario, 12.2 kbit/s is allocated to the speech codec, and a bitrate of 43-48 kbit/s is allowed for the video data (Sang-Bong, Tae-Jung and Jae-Won).
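To make the arithmetic behind these figures concrete, here is a minimal Python sketch. The 64 kbit/s gross rate, the 12.2 kbit/s speech rate and the 43-48 kbit/s video range come from the text above; the function names and the idea of treating call control as a fixed overhead figure are assumptions made purely for illustration.

```python
# Rough bandwidth budget for one direction of a 3G video call.
# Figures taken from the text: 64 kbit/s gross, 12.2 kbit/s speech.
# ASSUMPTION: call control overhead is modelled as a constant rate,
# which is a simplification; in practice it varies over the call.

GROSS_KBPS = 64.0
SPEECH_KBPS = 12.2


def video_budget_kbps(control_overhead_kbps,
                      gross_kbps=GROSS_KBPS,
                      speech_kbps=SPEECH_KBPS):
    """Bitrate left for video after control overhead and speech."""
    usable_kbps = gross_kbps - control_overhead_kbps
    return usable_kbps - speech_kbps


def overhead_for_video_kbps(video_kbps,
                            gross_kbps=GROSS_KBPS,
                            speech_kbps=SPEECH_KBPS):
    """Overhead implied by a given video bitrate (the inverse question)."""
    return gross_kbps - speech_kbps - video_kbps


if __name__ == "__main__":
    # About 4 kbit/s of overhead leaves the 60 kbit/s mentioned above.
    print(round(video_budget_kbps(4.0), 1))
    # Overhead implied by the lower end of the quoted video range.
    print(round(overhead_for_video_kbps(43.0), 1))
```

With roughly 4 kbit/s of overhead the video budget comes out just under 48 kbit/s, which matches the upper end of the quoted range; the lower end of 43 kbit/s would correspond to a little under 9 kbit/s of overhead.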
By employing rate control methods in the media encoders, the network can dynamically change these bitrates depending on network conditions and application demand. When two parties communicate simultaneously, the bitrates of the speech and video codecs can be reduced in the encoders of both parties, keeping the overall bitrate below 64 kbit/s. For instance, when only one party shows speech activity, the speech bitrate of the other party can be reduced to a minimum at which only comfort noise is generated on the receiver side (Holma and Toskala); AMR (Adaptive Multi-Rate) can perform these bitrate changes every 20 ms. For video, the encoder can reduce the average bitrate either by lowering the frame rate or by simply dropping frames during transmission. To increase the overall frame rate on the receiver side, the decoder can employ H.263 temporal scalability.
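The two mechanisms described above, switching the speech rate with voice activity and dropping video frames to stay within budget, can be sketched roughly as follows. This is only an illustration, not how any particular terminal implements rate control: the comfort noise rate is a placeholder value, and the simple "bit budget" frame dropper and all function names are assumptions.

```python
# Illustrative rate control sketch. Not a real 3G-324M implementation.

AMR_FRAME_MS = 20           # AMR can switch rate every 20 ms (from the text)
SPEECH_ACTIVE_KBPS = 12.2   # speech rate quoted in the text
COMFORT_NOISE_KBPS = 1.8    # ASSUMPTION: placeholder for the silent-side rate


def allocate(usable_kbps, speech_active):
    """Split the usable bitrate between speech and video.

    Returns (speech_kbps, video_kbps). When the local party is silent,
    the speech share shrinks and the difference goes to video.
    """
    speech = SPEECH_ACTIVE_KBPS if speech_active else COMFORT_NOISE_KBPS
    return speech, usable_kbps - speech


def bits_per_interval(kbps, interval_ms=AMR_FRAME_MS):
    """Bits available in one interval (kbit/s times ms gives bits)."""
    return kbps * interval_ms


def frames_to_send(frame_bits, video_kbps, frame_interval_ms):
    """Decide per frame whether to send it or drop it.

    ASSUMPTION: a naive budget that earns video_kbps * interval bits per
    frame slot and sends a frame only if the budget covers it. A real
    encoder would also cap the budget and adapt quantisation.
    """
    budget = 0.0
    decisions = []
    for bits in frame_bits:
        budget += video_kbps * frame_interval_ms
        if bits <= budget:
            budget -= bits
            decisions.append(True)   # frame fits, send it
        else:
            decisions.append(False)  # frame too large right now, drop it
    return decisions


if __name__ == "__main__":
    speech, video = allocate(60.0, speech_active=True)
    print(round(bits_per_interval(speech)), "speech bits per 20 ms")
    # Three encoded frames at 10 frames/s (100 ms apart), sizes in bits.
    print(frames_to_send([4000, 6000, 4000], video, 100))
```

In this toy run the second frame is larger than the bits accumulated so far and gets dropped, which lowers the effective frame rate while keeping the average bitrate within the video share.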
In 3G video telephony, the audio and video signals are streamed bidirectionally over dedicated circuit-switched W-CDMA (Wideband Code Division Multiple Access) paths. Streaming means that media is continuously received or sent and played back on a terminal. Non-conversational one-way audio or video streaming requires a transport delay variation below 2 s (3GPP (3rd Generation Partnership Project)). In contrast, two-way video telephony imposes even stricter real-time requirements, with an end-to-end, one-way delay below 150-400 ms (3GPP) to maintain a smooth conversation. The overall one-way delay in W-CDMA networks is already approximately 100 ms, and it should be noted that, in addition to the transmission time, media generation time is required when delivering IVVR (Interactive Voice & Video Response) services. Because of these tight delay requirements, there is no time for retransmission when transmission errors are detected. Retransmission would reduce bit errors and consequently improve video quality, but it would also add undesired delay when resending PDUs. Therefore, to avoid retransmission, H.223 and the media codecs work hand in hand to detect errors, accomplish resynchronisation, and perform error concealment.
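A quick back-of-the-envelope check shows why the delay budget rules out retransmission. The 100 ms network delay and the 150-400 ms limits are the figures quoted above; the assumption that one retransmission costs at least one extra round trip (twice the one-way delay) is mine, as are the function names.

```python
# Delay budget sketch for conversational video. Values in milliseconds.

NETWORK_ONE_WAY_MS = 100   # approximate W-CDMA one-way delay (from the text)
STRICT_LIMIT_MS = 150      # lower end of the quoted one-way limit
RELAXED_LIMIT_MS = 400     # upper end of the quoted one-way limit


def headroom_ms(limit_ms, network_ms=NETWORK_ONE_WAY_MS, processing_ms=0):
    """Delay left over after network transport and media processing."""
    return limit_ms - network_ms - processing_ms


def retransmission_fits(limit_ms, network_ms=NETWORK_ONE_WAY_MS,
                        processing_ms=0):
    """True if one retransmission still fits within the limit.

    ASSUMPTION: a retransmission costs at least one extra round trip,
    i.e. 2 * network_ms, ignoring error detection and queuing time.
    """
    retransmit_cost_ms = 2 * network_ms
    return headroom_ms(limit_ms, network_ms, processing_ms) >= retransmit_cost_ms


if __name__ == "__main__":
    for limit in (STRICT_LIMIT_MS, RELAXED_LIMIT_MS):
        print(limit, headroom_ms(limit), retransmission_fits(limit))
```

Under these assumptions, the strict 150 ms target leaves only about 50 ms for everything besides transport, far less than a round trip. Even the relaxed 400 ms limit would see most of its headroom consumed by a single retransmission before any encoding, decoding or IVVR media generation time is counted.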
By Christoph Köpernick
Check out: Interactive Voice & Video Response (IVVR, Video IVR) Blog