Reexamination Certificate
2000-03-16
2004-01-27
Kelley, Chris (Department: 2613)
Pulse or digital communications
Bandwidth reduction or expansion
Television or motion video signal
C375S240250
active
06683909
ABSTRACT:
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to video teleconferencing, and in particular, to complying with the maximum-transmission-unit size supported by the underlying transport mechanism.
2. Background Information
A video teleconference, as its name implies, is a conference in which several audio-visual terminals located remotely from each other participate. In one instance, the videoconferencing system allows for the simultaneous exchange of video, audio, and other data between terminals. As FIG. 1 shows, an example of such a system is a plurality of interconnected terminals 11, 12, 15, and 16. For the sake of example, the drawing shows the transmission medium as including an Integrated Services Digital Network (ISDN) and a Transmission Control Protocol/Internet Protocol (TCP/IP) network. In other words, videoconferencing can be performed by way of packet-switched networks as well as circuit-switched networks. A gateway 22 translates between protocols in the example.
A multipoint control unit (MCU) 20 receives signals from the various terminals, processes these signals into a form suitable for video teleconferencing, and re-transmits the processed signals to the appropriate terminals. For example, the video signals from the various terminals may be spatially mixed to form a composite video signal that, when decoded, may display the various teleconference participants on one terminal. Usually, each terminal has a codec to encode video, audio, and/or data signals to send to the MCU for appropriate distribution and to decode such signals from the MCU. Codecs for this purpose are well known in the art and are exemplified, for instance, in the International Telecommunication Union (ITU) Telecommunication Standardization Sector recommendation document H.261 (ITU-T Recommendation H.261).
The Telecommunication Standardization Sector of the International Telecommunication Union (ITU-T) is responsible for standardizing the technical aspects of telecommunication on a worldwide basis. Its H-series recommendations concern video teleconferencing. Among other H-series recommendations, H.221 defines frame structure, H.261 defines video coding and decoding, H.231 defines multipoint control units (MCUs), H.320 defines audio-visual terminals, and H.323 defines audio-visual terminals for networks that do not provide a guaranteed quality of service. How the various devices in the video teleconferencing system interact with each other using the various recommendations is now briefly described.
The H.320 terminals employed in the system transmit H.221 frames of multiplexed audio-video and data information. (These frames should not be confused with video frames, which we will hereafter refer to as “pictures” to distinguish them from transmission frames.) Each frame consists of one or more channels, each of which comprises 80 octets of bits, and each of the 8 octet bit positions can be thought of as a separate sub-channel within the frame. In general, certain bits of a given octet will contain video information, certain bits will contain audio information, and certain bits may contain data, as FIG. 2's first row illustrates. Additionally, the eighth bit in certain of a frame's octets (not shown in the drawings) represents control information by which, among other things, frame boundaries can be recognized. The precise bit allocation is determined through a session negotiation process among the involved video teleconferencing terminals.
The H.323 terminals employed in the system use the Real-time Transport Protocol (RTP), known to one skilled in the art and set forth in Request For Comments (RFC) 1889. RFCs are published by the Internet Engineering Task Force (IETF), a community dedicated to standardizing various aspects of the Internet. An H.323 terminal uses separate RTP sessions to communicate the conference's video and audio portions. Thus, as FIG. 2's first through third rows show, a gateway's operation of translating from H.221 to RTP involves demultiplexing the H.221 data stream into its video, audio, and data constituents so that the gateway can packetize the video, audio, and data separately. In particular, video bits are extracted from a succession of octets and concatenated into a stream that contains only the H.221 transmission's video parts. The stream is encoded in accordance with the H.261 recommendation at the terminal using a codec. Note that the encoding may instead be in accordance with the related H.263 recommendation. However, the H.261 recommendation will generally be focused on here.
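The demultiplexing step described above can be illustrated with a minimal Python sketch. It assumes a fixed, hypothetical bit mask marking which bit positions of each octet carry video; in a real session the allocation is negotiated and may change, and the bit ordering used here (most significant bit first) is also an assumption rather than something taken from H.221. The function names are illustrative only.

```python
from typing import List


def extract_subchannel_bits(octets: bytes, mask: int) -> List[int]:
    """Collect, octet by octet, the bits whose positions are set in mask.

    Bits are visited from the most significant position to the least
    significant one (an assumed ordering, chosen for illustration).
    """
    bits: List[int] = []
    for octet in octets:
        for pos in range(7, -1, -1):
            if mask & (1 << pos):
                bits.append((octet >> pos) & 1)
    return bits


def pack_bits(bits: List[int]) -> bytes:
    """Pack a list of bits into bytes, MSB first, zero-padding the last byte."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        chunk = chunk + [0] * (8 - len(chunk))
        value = 0
        for bit in chunk:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    # Hypothetical allocation: the low four bit positions carry video.
    frame_octets = bytes([0b11110000, 0b10101010])
    video_bits = extract_subchannel_bits(frame_octets, 0b00001111)
    print(pack_bits(video_bits).hex())
```

Running the same extraction with a different mask would yield the audio or data constituent, which is how one multiplexed octet stream can be split into separately packetized streams.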
FIG. 3 illustrates a typical link layer packet suitable for transmission in accordance with the RTP protocol. If Ethernet is used for the link layer, information is sent in an Ethernet frame that begins and ends with an Ethernet header and trailer, which are used for sending the information to the next stop on the same local network. The frame's contents are an IP datagram, which also includes its own header, specified in RFC 791, for directing the datagram to its ultimate internetwork address. In video conference situations, RTP permits TCP to be used as the transport protocol (i.e., as the protocol for directing the information to the desired application at the destination internet address). However, the User Datagram Protocol (UDP) is preferable to TCP for videoconferencing because TCP's re-transmission of lost video data is unnecessary in these situations. Thus, FIG. 3 depicts the IP payload as a UDP datagram that includes a UDP header as specified in RFC 768.
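Because each layer adds its own header, the room left for media data in one link layer packet is smaller than the maximum-transmission-unit size. The following Python sketch works through that arithmetic. The 1500-byte default is the usual Ethernet value and is an assumption here, as are the names; the header sizes are the minimum sizes from RFC 791 (no IP options), RFC 768, and RFC 1889 (fixed header plus 4 bytes per CSRC entry). Any payload-specific header is not counted.

```python
ETHERNET_MTU = 1500    # assumed MTU in bytes; other links differ
IPV4_HEADER = 20       # RFC 791 header without options
UDP_HEADER = 8         # RFC 768 header
RTP_FIXED_HEADER = 12  # RFC 1889 fixed header


def max_rtp_payload(mtu: int = ETHERNET_MTU, csrc_count: int = 0) -> int:
    """Return the bytes available for RTP payload in one IP datagram."""
    if not 0 <= csrc_count <= 15:
        raise ValueError("csrc_count must fit in the 4-bit CC field")
    overhead = IPV4_HEADER + UDP_HEADER + RTP_FIXED_HEADER + 4 * csrc_count
    if overhead >= mtu:
        raise ValueError("MTU too small for IP/UDP/RTP headers")
    return mtu - overhead


if __name__ == "__main__":
    print(max_rtp_payload())  # 1500 - 20 - 8 - 12 = 1460
```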
Because packet-switched protocol data units do not in general arrive in order, and because real-time information must be presented in a predetermined time sequence, the UDP payload must include information specifying the sequence in which the information was sent and its real-time relationship to other packets. So the payload begins with an RTP header, specified in RFC 1889, that gives this and other information.
The RTP header format, depicted in FIG. 4, is shown as successive four-byte rows. RFC 1889 describes the purposes of the various FIG. 4 fields in detail, so only the timestamp field is mentioned here. When information travels by way of a packet-switched network, different constituent packets make their way to their common destination independently. That is, different packets can take different routes, so the times required for different packets to arrive at their destination are not in general the same, and packets can arrive out of sequence or in time relationships that otherwise differ from those with which their contained information was generated. RTP therefore provides for a timestamp in each packet to indicate the real-time relationships with which the information is to be played. Typically, gateways and H.323 devices (e.g., terminals and MCUs) use a local clock to provide the RTP-required timestamp as they assemble H.261 packets.
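As an illustration of the four-byte-row layout, the Python sketch below packs and unpacks the 12-byte fixed RTP header defined in RFC 1889 (version, padding, extension, CSRC count, marker, payload type, sequence number, timestamp, SSRC). The class and function names are hypothetical, and CSRC entries and header extensions are deliberately left out.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


# Network byte order: two single bytes, a 16-bit field, two 32-bit fields.
_RTP_FORMAT = "!BBHII"


def pack_rtp_header(h: RtpHeader) -> bytes:
    """Serialize the fixed header into 12 bytes."""
    byte0 = ((h.version & 0x03) << 6) | (int(h.padding) << 5) \
        | (int(h.extension) << 4) | (h.csrc_count & 0x0F)
    byte1 = (int(h.marker) << 7) | (h.payload_type & 0x7F)
    return struct.pack(_RTP_FORMAT, byte0, byte1,
                       h.sequence_number & 0xFFFF,
                       h.timestamp & 0xFFFFFFFF,
                       h.ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data: bytes) -> RtpHeader:
    """Parse the first 12 bytes of an RTP packet."""
    if len(data) < 12:
        raise ValueError("RTP fixed header needs at least 12 bytes")
    byte0, byte1, seq, ts, ssrc = struct.unpack(_RTP_FORMAT, data[:12])
    return RtpHeader(byte0 >> 6, bool(byte0 & 0x20), bool(byte0 & 0x10),
                     byte0 & 0x0F, bool(byte1 & 0x80), byte1 & 0x7F,
                     seq, ts, ssrc)


if __name__ == "__main__":
    # Example values only; the timestamp would come from the sender's clock.
    header = RtpHeader(2, False, False, 0, True, 31, 7, 90000, 0x1234)
    print(unpack_rtp_header(pack_rtp_header(header)))
```

A receiver can sort packets by `sequence_number` to restore sending order and use `timestamp` to schedule playback.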
However, it would be complicated to play the resultant timestamped information if no notice were taken of the actual contents of the data stream being packetized. For example, a single packet could contain parts of two different video pictures, so parts of different pictures would share a timestamp, while different parts of the same picture could carry different timestamps. To avoid this, the data stream needs to be monitored for picture boundaries as it is packetized.
FIG. 2's fourth through seventh rows depict the structure that the incoming data stream uses to represent successive video pictures in accordance with H.261. The fourth row illustrates a data-stream portion covering a single video picture. It shows that the portion begins with a header, and FIG. 5 illustrates that header's structure.
The header field of importance here is the Picture Start Code (PSC). For H.261 streams, that field value is always 00010H (hexadecimal), a sequence that cannot occur elsewhere in the data stream. If the length of a single-picture portion of the data stream exceeds the underlying protocol's maximum-transmission-unit size, the H.323 device
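
Monitoring for picture boundaries amounts to searching the bit stream for the Picture Start Code. The following Python sketch shows one simple way to do so, assuming the PSC is the 20-bit value 00010 hexadecimal and that bits are read most significant bit first within each byte. The function name is hypothetical, and this bit-by-bit scan is only an illustration, not the method claimed in this patent.

```python
from typing import List

PSC_VALUE = 0x00010            # Picture Start Code value
PSC_BITS = 20                  # assumed width of the code in bits
PSC_MASK = (1 << PSC_BITS) - 1


def find_picture_starts(stream: bytes) -> List[int]:
    """Return the bit offsets at which a Picture Start Code begins.

    The code need not be byte-aligned, so a 20-bit sliding window is
    advanced one bit at a time and compared against PSC_VALUE.
    """
    offsets: List[int] = []
    window = 0
    bits_seen = 0
    for byte in stream:
        for pos in range(7, -1, -1):
            window = ((window << 1) | ((byte >> pos) & 1)) & PSC_MASK
            bits_seen += 1
            # Ignore matches until a full 20 bits have been read.
            if bits_seen >= PSC_BITS and window == PSC_VALUE:
                offsets.append(bits_seen - PSC_BITS)
    return offsets


if __name__ == "__main__":
    aligned = bytes([0x00, 0x01, 0x00])
    shifted = bytes([0xE0, 0x00, 0x21])  # same code preceded by three 1 bits
    print(find_picture_starts(aligned), find_picture_starts(shifted))
```

The distance between consecutive offsets gives the length of each single-picture portion, which can then be compared against the maximum-transmission-unit size.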
Cesari and McKenna LLP
Czekaj David
Ezenial Inc.
Kelley Chris