Television – Two-way video and voice communication – Conferencing
Reexamination Certificate
1999-08-26
2001-08-21
Kuntz, Curtis (Department: 2643)
C348S014080, C348S014120
active
06278478
ABSTRACT:
TECHNICAL FIELD
This invention relates generally to audio video teleconferencing between two or more remote parties and, more particularly, relates to a novel hardware acceleration architecture which provides improved audio and video data encoding, decoding, and transmission performance.
BACKGROUND OF THE INVENTION
Video teleconferencing generally involves a meeting between remote parties, whereby each remote party is able to see and hear at least one other remote party. This generally requires the rapid transmission of synchronized audio and video data. Typically, the data that is to be transmitted is first captured and encoded by a sending computer, and then communicated via an electronic data channel to a receiving computer, where the data is received, decoded, and rendered or otherwise manifested to the receiving conference party. Existing schemes for carrying out the aforementioned steps include a hardware device to capture the information to be sent and a separate software encoder to encode the information. Typically, the capture hardware itself is not network aware, and cannot responsively tailor its processing or output to the needs of the receiving computer. Additionally, the encoder is typically a software module running on the host computer, consuming host memory and processor time. At the remote receiving location, the decoder is typically “dumb,” in the sense that it does not interpret or analyze the data stream, but merely performs a decoding function. Additionally, as with the typical encoder, the typical decoder is implemented as a software module running on the host.
This existing system of exchanging audio and video teleconferencing data gives rise to many inefficiencies. During a video teleconference, the computational resources of a computer are often fully utilized, and occasionally exhausted, while the resources of the sending capture device or the receiving video data processor are typically under-utilized. This inefficient allocation of computing resources sometimes leads to deterioration or loss of the video or audio data being transmitted. Additionally, the output of the capture device must sometimes be further processed prior to encoding, causing further inefficiencies. Finally, because existing capture devices are not able to directly access communication channels to their counterparts at a remote location during a teleconference in order to optimize the encoding/decoding process, data transmission is often of less than optimal quality and accuracy.
A system implementing the invention preferably conforms to the appropriate International Telecommunication Union (ITU) standards for multimedia communications over packet-based networks. In particular, the Telecommunication Standardization Sector of the ITU (ITU-T) has published a set of standards under the H.323 designation which include standards for data channels, monitoring channels, and control channels. According to the H.323 group of standards, audio and video data streams to be transmitted are encoded (compressed) and packetized in conformance with a real-time transport protocol (RTP) standard. The packets thus generated include both data and header information. The header information facilitates synchronization, loss detection, and status detection. Within the H.323 recommendation, video applications may use the H.261, H.262, or H.263 protocols for data transmissions, while audio applications may use the G.711, G.722, G.723.1, G.728, or G.729 protocols. Any class of network which utilizes TCP/IP will generally support H.323 compliant teleconferencing. Examples of such networks include the Internet and many LANs.
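As an illustration of the packetization step described above, the following is a minimal sketch of the 12-byte fixed RTP header and of splitting one encoded frame into packets. It assumes no CSRC list, no header extension, and no codec-specific payload header (real H.261 and H.263 payload formats add one). The function names and the 1400-byte payload limit are hypothetical and are not taken from the patent.

```python
import struct

RTP_VERSION = 2  # version field value used by current RTP


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Pack the 12-byte fixed RTP header (assumes P=0, X=0, CC=0)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(packet):
    """Decode the fixed header fields a receiver uses for sync and loss detection."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,         # gaps in this counter reveal lost packets
        "timestamp": timestamp,  # shared by all packets of one frame
        "ssrc": ssrc,
    }


def packetize(frame, payload_type, first_seq, timestamp, ssrc, max_payload=1400):
    """Split one encoded frame into RTP packets; the marker bit flags the last one."""
    chunks = [frame[i:i + max_payload]
              for i in range(0, len(frame), max_payload)] or [b""]
    packets = []
    for i, chunk in enumerate(chunks):
        header = pack_rtp_header(payload_type, first_seq + i, timestamp, ssrc,
                                 marker=(i == len(chunks) - 1))
        packets.append(header + chunk)
    return packets


# Payload type 31 is the static assignment for H.261 in the RTP audio/video profile.
packets = packetize(b"x" * 3000, payload_type=31, first_seq=65535,
                    timestamp=90000, ssrc=0x1234)
```

The sequence number wraps at 16 bits, so the second packet in this usage example carries sequence number 0.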
An H.323 compliant terminal generally initiates and conducts a communications session via a gatekeeper. Accordingly, although a gatekeeper is not required, a typical teleconference may have a gatekeeper at each of the transmitting and receiving ends. The gatekeeper may perform address translation and bandwidth management, and may serve to map LAN aliases to IP addresses.
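The two gatekeeper duties named above, alias-to-address translation and bandwidth management, can be pictured with a toy model. This is only a sketch: the class, its methods, the alias, and the bandwidth figures are hypothetical, and it does not implement the registration and admission signaling that an H.323 gatekeeper actually uses.

```python
class Gatekeeper:
    """Toy gatekeeper: resolves LAN aliases and rations a fixed bandwidth pool."""

    def __init__(self, total_bandwidth_kbps):
        self._aliases = {}                      # alias -> IP address
        self._available = total_bandwidth_kbps  # bandwidth not yet granted

    def register(self, alias, ip_address):
        self._aliases[alias] = ip_address

    def resolve(self, alias):
        """Address translation: return the IP for an alias, or None if unknown."""
        return self._aliases.get(alias)

    def admit(self, alias, requested_kbps):
        """Grant up to the requested bandwidth; return (ip, granted_kbps) or None."""
        ip_address = self.resolve(alias)
        if ip_address is None or self._available <= 0:
            return None
        granted = min(requested_kbps, self._available)
        self._available -= granted
        return ip_address, granted

    def release(self, kbps):
        """Return bandwidth to the pool when a call ends."""
        self._available += kbps


gatekeeper = Gatekeeper(total_bandwidth_kbps=500)
gatekeeper.register("conf-room-a", "192.0.2.10")
first_call = gatekeeper.admit("conf-room-a", 384)
second_call = gatekeeper.admit("conf-room-a", 384)  # only 116 kbps remain
```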
Additionally, in order to allow for the exchange of status information between the transmitter and receiver, a real-time transport control protocol (RTCP) channel is opened.
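One kind of status information a receiver can report back over such a channel is packet loss. The sketch below tracks RTP sequence numbers, including 16-bit wraparound, and derives a cumulative loss count and an 8-bit fixed-point loss fraction for each reporting interval, in the style of RTCP receiver reports. The class and method names are hypothetical, and the sketch omits jitter, duplicate detection, and the restart handling a full receiver would need.

```python
class ReceptionStats:
    """Tracks received RTP sequence numbers and summarizes loss per report."""

    def __init__(self):
        self.base_seq = None      # first sequence number seen
        self.max_ext_seq = None   # highest sequence number, extended past 16 bits
        self.received = 0
        self._expected_prior = 0
        self._received_prior = 0

    def on_packet(self, seq):
        if self.base_seq is None:
            self.base_seq = self.max_ext_seq = seq
        else:
            # Extend the 16-bit number using the current wrap count.
            ext = (self.max_ext_seq & ~0xFFFF) | seq
            if ext < self.max_ext_seq - 0x8000:
                ext += 0x10000    # sequence number wrapped forward
            elif ext > self.max_ext_seq + 0x8000 and ext >= 0x10000:
                ext -= 0x10000    # late packet from before the wrap
            if ext > self.max_ext_seq:
                self.max_ext_seq = ext
        self.received += 1

    def report(self):
        """Return loss figures since the previous report."""
        if self.base_seq is None:
            return {"cumulative_lost": 0, "fraction_lost": 0}
        expected = self.max_ext_seq - self.base_seq + 1
        expected_interval = expected - self._expected_prior
        received_interval = self.received - self._received_prior
        self._expected_prior = expected
        self._received_prior = self.received
        lost_interval = expected_interval - received_interval
        if expected_interval == 0 or lost_interval <= 0:
            fraction = 0
        else:
            fraction = (lost_interval << 8) // expected_interval  # units of 1/256
        return {"cumulative_lost": expected - self.received,
                "fraction_lost": fraction}


stats = ReceptionStats()
for seq in (65534, 65535, 1, 2):   # packet 0 never arrives
    stats.on_packet(seq)
first_report = stats.report()
second_report = stats.report()     # no new packets since the first report
```

In this usage example five packets were expected and four arrived, so the first report gives one lost packet and a fraction of 51/256.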
In order to provide control functions, an H.245 control channel is established. This channel supports the exchange of capability information, the opening and closing of data channels, and other control and indication functions.
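The capability exchange can be thought of as each terminal advertising what it supports, after which a mutually supported format is chosen before a data channel is opened. The sketch below models only that selection logic with plain dictionaries. It is not the H.245 message syntax, and the function names and preference lists are hypothetical.

```python
def select_common(local_preferences, remote_capabilities):
    """Return the first locally preferred format the remote side also supports."""
    remote = set(remote_capabilities)
    for codec in local_preferences:
        if codec in remote:
            return codec
    return None  # no common format: this media channel cannot be opened


def negotiate(local, remote):
    """Pick one format per media type from two advertised capability sets."""
    return {
        media: select_common(local.get(media, []), remote.get(media, []))
        for media in ("audio", "video")
    }


# Hypothetical capability sets, listed in order of local preference.
local_caps = {"video": ["H.263", "H.261"], "audio": ["G.723.1", "G.711"]}
remote_caps = {"video": ["H.261"], "audio": ["G.711", "G.729"]}
chosen = negotiate(local_caps, remote_caps)
```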
Although the preferred embodiment will be described in the context of the Microsoft brand Windows Driver Model (WDM), one of skill in the art will appreciate that the invention is not limited to this implementation. The Windows Driver Model is a common set of services which allow the creation of drivers having compatibility between the Microsoft brand Windows 98 operating system and the Microsoft brand Windows 2000 operating system. Each WDM class abstracts many of the common details involved in controlling a class of similar devices. WDM utilizes a layered approach, implementing these common tasks within a WDM “class driver.” Driver vendors may then supply smaller “minidriver” code entities to interface the hardware of interest to the WDM class driver.
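The layered split between a class driver and a minidriver can be sketched as follows. Real WDM drivers are kernel-mode code written against the Windows driver interfaces, so this Python model is only an analogy: every class and method name here is hypothetical, and the "hardware" is a stub that returns zero-filled buffers.

```python
class StreamClassDriver:
    """Stand-in for a class driver: owns the request queue and state checks
    that every device of the class would otherwise have to reimplement."""

    def __init__(self, minidriver):
        self._mini = minidriver
        self._pending = []
        self._running = False

    def start(self):
        self._mini.hw_start()
        self._running = True

    def stop(self):
        self._running = False
        self._mini.hw_stop()

    def queue_read(self, num_bytes):
        if not self._running:
            raise RuntimeError("stream is not running")
        self._pending.append(num_bytes)

    def service(self):
        """Complete all queued reads by delegating to the hardware-specific code."""
        completed = [self._mini.hw_read(n) for n in self._pending]
        self._pending.clear()
        return completed


class FakeCameraMinidriver:
    """Stand-in for a vendor minidriver: only the device-specific operations."""

    def __init__(self):
        self.powered = False

    def hw_start(self):
        self.powered = True

    def hw_stop(self):
        self.powered = False

    def hw_read(self, num_bytes):
        return bytes(num_bytes)  # zero-filled placeholder for captured data


camera = FakeCameraMinidriver()
driver = StreamClassDriver(camera)
driver.start()
driver.queue_read(4)
buffers = driver.service()
```

The minidriver stays small because queueing and state validation live in the shared layer, which is the division of labor the paragraph above describes.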
WDM provides, among other functionalities, a Stream class driver to support kernel-mode streaming, allowing greater efficiency and reduced latency over user mode streaming. The stream architecture utilizes an interconnected filter organization, and employs the mechanism of “pins” to communicate to and from the filters, and to pass data. Both filters and pins are Component Object Model (COM) objects. The filter is a COM object that performs a specific task, such as transforming data, while a pin is a COM object created by the filter to represent a point of connection for a unidirectional data stream on the filter. Input pins accept data into the filter while output pins provide data to other filters. Filters and pins preferably expose control interfaces that other pins, filters, or applications can use to configure the behavior of those filters and pins. The interface “IBaseFilter” is an example of a filter configuration interface. An embodiment of the invention will be described by reference to the filters and pins of the WDM model hereinafter. For further information regarding the Windows Driver Model, please see WDM Kernel Streaming Architecture, available on the Internet at http://www.microsoft.com/Devonly/tech/hardware/desinit/csal.htm, or Windows Driver Model (WDM) Technology, available on the Internet at http://www.microsoft.com/Devonly/tech/hardware/WDM/default.htm.
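The filter and pin organization described above can be pictured with a small model in which each filter owns one input pin and one output pin, and data flows in one direction from an output pin to the input pin connected to it. This is a plain-Python analogy, not COM and not the kernel streaming interfaces. All names are hypothetical, and `bytes.upper` merely stands in for an encoder.

```python
class InputPin:
    """Connection point that accepts data into its owning filter."""

    def __init__(self, owner):
        self.owner = owner

    def receive(self, sample):
        self.owner.process(sample)


class OutputPin:
    """Connection point that provides data to the next filter's input pin."""

    def __init__(self, owner):
        self.owner = owner
        self.peer = None

    def connect(self, input_pin):
        self.peer = input_pin

    def deliver(self, sample):
        if self.peer is not None:
            self.peer.receive(sample)


class Filter:
    """Performs one task on each sample and passes the result downstream."""

    def __init__(self, name, transform=None):
        self.name = name
        self.input = InputPin(self)
        self.output = OutputPin(self)
        self._transform = transform or (lambda sample: sample)

    def process(self, sample):
        self.output.deliver(self._transform(sample))


class SinkFilter(Filter):
    """Terminal filter that collects samples instead of forwarding them."""

    def __init__(self, name):
        super().__init__(name)
        self.samples = []

    def process(self, sample):
        self.samples.append(sample)


# Build a three-filter graph: capture -> encoder -> network sink.
capture = Filter("capture")
encoder = Filter("encoder", transform=bytes.upper)
sink = SinkFilter("network")
capture.output.connect(encoder.input)
encoder.output.connect(sink.input)
capture.process(b"frame")
```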
As illustrated in FIG. 6, to control and access the kernel mode streaming data of the WDM architecture, a module such as Microsoft brand Telephony Application Programming Interface 3.0 (TAPI 3.0) running in user mode may be utilized by an application 610. The TAPI 3.0 COM API is implemented as a suite of COM objects, chiefly Call Control 600, Media Stream Control 602, and Directory Control 604. A Telephony Service Provider (TSP) 606 is responsible for resolving the protocol-independent call model of TAPI into protocol-specific call-control mechanisms, while a Media Stream Provider (MSP) 608 implements Microsoft brand DIRECTSHOW interfaces for a particular TSP. Microsoft brand DIRECTSHOW, part of the WDM, is an architecture which facilitates the control of multimedia data streams via modular components. TAPI 3.0 employs a kernel streaming proxy module such as KSProxy, a Microsoft DIRECTSHOW filter, to control and communicate with kernel mode filters. KSProxy provides a generic method of representing kernel mode streaming filters as DIRECTSHOW filters. Running in user mode, KSProxy accepts existing control interfaces and translates them into input/output control calls to the WDM streaming drivers. TAPI 3.0 may automatically create the WDM filter graph by invoking the appropriate filters and connecting the appropriate pins. For more information regarding TAPI 3.0, see IP Telephony With TAPI 3.0, available at http://msdn.microsoft.com/librar
Kuntz Curtis
Leydig , Voit & Mayer, Ltd.
Microsoft Corporation
Ramakrishnaiah Melur